Innovative Medical R&D Insights Using Machine Learning with Gedeon Richter

Practice Area

  • Data Science

Business Impact

  • Improved insights in product R&D

Challenges

  • Automatically and reliably detect mitochondria in high-resolution 3D confocal microscopy images
  • Quantify the properties of mitochondrial networks based on segmentation
  • Assess the health of cells based on the properties of the mitochondrial network
  • Lack of established technology

Technology

  • Python

Background

Gedeon Richter, a multinational pharmaceutical and biotechnology company headquartered in Hungary, operates the largest pharmaceutical research center in the region. As a company dedicated to innovation, they are constantly seeking ways to improve the quality of their solutions.

Part of Richter’s R&D work involves quantifying the properties of the mitochondrial network within neurons to enable more effective analysis of medications for various neurological diseases. Mitochondria serve as the combustion engine of the cell, and their malfunctioning can cause network-level problems leading to dementia and related conditions. There is a well-established methodology for quantifying the properties of the mitochondrial network in two dimensions, but because the neuronal network is elaborate and thick, 2D imaging cannot capture all the necessary information. To get past these limitations, Richter initiated a project with Starschema to use data science to build a solution that captures this information in greater detail.

Challenge

Richter identified three key deliverables for the project that would enable them to gain access to the information needed to optimize their products. First, the team had to develop a methodology for the automatic and reliable detection of mitochondria in high-resolution 3D confocal microscopy images – a process commonly known as segmentation. Second, they had to find a way to quantify the properties of a mitochondrial network based on the segmentation. Finally, the solution had to enable researchers to tell if the cells are healthy or damaged based on the properties of the mitochondrial network.

The project represented a relatively new field, with very little established technology to rely on. A major reason is that 3D image capture is not widespread: because it is used predominantly for medical purposes, the range of tools, packages and methodologies suitable for this use case is narrow. Even converting a 3D image from one format to another is a complicated process. This made it necessary to develop completely new toolsets, and the project took shape as an R&D proof-of-concept program involving a lot of experimentation.

Solution

The Richter-Starschema team used Richter’s IT environment as a reference to develop a methodology that biologists can leverage without data science and programming knowledge. With this requirement in mind, the team designed a solution that would remain flexible enough to serve future needs by allowing biologists to make minor modifications. The team focused primarily on the task of quantifying the properties of a mitochondrial network and using it to assess the health of the cells. Given that there is no currently available toolset for this purpose, successful development would open up unprecedented abilities for researchers and likely lead to great advancements in the field.

We visualize the mitochondria in neuronal culture with a specific mitochondrial marker in green using confocal laser scanning microscopy, then reconstruct the acquired Z-slices in 3D. Note the different morphologies of the mitochondria present in the cells.

The team built a machine learning model that accounts for the fact that both healthy and damaged cells contain all types of mitochondria with different geometrical properties, only in different ratios. This called for robust, interpretable algorithms capable of handling the “noisy” nature of the training data. The team therefore used a small decision tree that yields exact rules and class ratios for each of its leaves. Each class comprises similarly shaped mitochondria, and the decision tree looks for partitions that contain almost exclusively healthy or almost exclusively damaged samples, so that biologists can validate that the identified patterns reflect real relationships. Perfect separation cannot be expected from samples that inevitably contain both types of mitochondria, but the algorithm was still able to find patterns that are more typical of either healthy or damaged mitochondrial networks. The team validated the results on a test dataset of 3D images created especially for this project.
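The idea of a small, interpretable tree can be illustrated with a minimal sketch. This is not the project's code: the feature names (`elongated_ratio`, `fragmented_ratio`), the sample values, the depth limit and the Gini split criterion are all assumptions chosen for illustration. The sketch fits a depth-limited tree on noisy labels and prints, for every leaf, the rule that leads to it and the share of healthy and damaged samples it contains.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure partition)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini impurity most, or None."""
    best, best_score = None, gini(labels)
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[feature] <= t]
            right = [y for r, y in zip(rows, labels) if r[feature] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (feature, t), score
    return best

def fit_tree(rows, labels, max_depth=2):
    """Fit a small tree; each leaf stores its sample count and class ratios."""
    split = best_split(rows, labels) if max_depth > 0 else None
    if split is None:
        n = len(labels)
        return {"n": n, "ratios": {k: v / n for k, v in Counter(labels).items()}}
    feature, t = split
    left = [i for i, r in enumerate(rows) if r[feature] <= t]
    right = [i for i, r in enumerate(rows) if r[feature] > t]
    return {
        "feature": feature,
        "threshold": t,
        "left": fit_tree([rows[i] for i in left], [labels[i] for i in left], max_depth - 1),
        "right": fit_tree([rows[i] for i in right], [labels[i] for i in right], max_depth - 1),
    }

def leaf_rules(node, path=()):
    """Yield (rule, class ratios, sample count) for every leaf of the tree."""
    if "ratios" in node:
        yield " and ".join(path) or "always", node["ratios"], node["n"]
        return
    f, t = node["feature"], node["threshold"]
    yield from leaf_rules(node["left"], path + (f"{f} <= {t:.3f}",))
    yield from leaf_rules(node["right"], path + (f"{f} > {t:.3f}",))

def predict(node, row):
    """Follow the rules to a leaf and return its majority class."""
    while "ratios" not in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return max(node["ratios"], key=node["ratios"].get)

# Hypothetical per-sample features; two samples are deliberately "noisy".
samples = [
    ({"elongated_ratio": 0.70, "fragmented_ratio": 0.10}, "healthy"),
    ({"elongated_ratio": 0.65, "fragmented_ratio": 0.15}, "healthy"),
    ({"elongated_ratio": 0.60, "fragmented_ratio": 0.20}, "healthy"),
    ({"elongated_ratio": 0.55, "fragmented_ratio": 0.25}, "damaged"),
    ({"elongated_ratio": 0.35, "fragmented_ratio": 0.45}, "healthy"),
    ({"elongated_ratio": 0.30, "fragmented_ratio": 0.50}, "damaged"),
    ({"elongated_ratio": 0.25, "fragmented_ratio": 0.55}, "damaged"),
    ({"elongated_ratio": 0.20, "fragmented_ratio": 0.60}, "damaged"),
]
rows = [row for row, _ in samples]
labels = [label for _, label in samples]

tree = fit_tree(rows, labels, max_depth=2)
for rule, ratios, n in leaf_rules(tree):
    print(f"{rule}: n={n}, ratios={ratios}")
```

Because every leaf exposes its rule and class ratios, a domain expert can read the output directly and judge whether a partition corresponds to a biologically plausible pattern, even when a leaf still contains a mix of both classes.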

Throughout the project, biologists from Richter contributed deep domain knowledge that was essential to the project’s success. They conducted all experiments and provided the 3D confocal microscopy images, pre-screening the samples so that algorithms could be trained on samples where the impairment was clear. Richter biologists also identified the targets that the new methodology would have to meet and evaluated the masks – binary versions of an image showing the location of only mitochondria in the image – and the different patterns that were learned from the data.
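To make the notion of a mask concrete, the sketch below splits a binary 3D mask into individual connected objects, a common first step before each object can be measured. It is illustrative only: the nested-list layout `mask[z][y][x]`, the choice of 6-connectivity and the function name `label_components` are assumptions, and a real pipeline would more likely rely on an image-processing library.

```python
from collections import deque

def label_components(mask):
    """Group the foreground voxels of a binary 3D mask into connected objects.

    mask[z][y][x] is 1 for foreground and 0 for background.
    Returns a list of components, each a list of (z, y, x) coordinates.
    Voxels are connected only if they share a face (6-connectivity).
    """
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    components = []
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first search from an unvisited foreground voxel.
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                voxels = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    voxels.append((cz, cy, cx))
                    for dz, dy, dx in steps:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                        if inside and mask[nz][ny][nx] and (nz, ny, nx) not in seen:
                            seen.add((nz, ny, nx))
                            queue.append((nz, ny, nx))
                components.append(voxels)
    return components

# Toy mask with two Z-slices: one object spans both slices along the top row,
# a second, smaller object sits in the bottom-right corner.
toy_mask = [
    [[1, 1, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 1]],
    [[0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 1]],
]
components = label_components(toy_mask)
sizes = sorted(len(c) for c in components)
print(sizes)  # [2, 4]
```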

This is an example of a mask where mitochondria are colored according to their morphological classes. The blue ones are elongated and are more typical in healthy samples, while the red and green classes, which are smaller and more fragmented here, have higher ratios in damaged cells.

The solution took shape as Python code that calculates the geometrical features of mitochondria, together with a decision tree model that had learned the main patterns distinguishing healthy from damaged cells. The code and model are suitably future-proof, as they require no manual setup before execution to classify new images. After the Starschema team handed over the original code, the Richter team used a separate in-house dataset to create the final model that the company would rely on for further research.
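The case study does not list which geometrical features were calculated. As a hedged illustration of what such a calculation can look like, the sketch below derives a few simple shape descriptors from the voxel coordinates of one object. The function name `geometric_features`, the chosen descriptors and the `spacing` parameter (the physical voxel size along each axis) are assumptions, not the project's actual feature set.

```python
def geometric_features(voxels, spacing=(1.0, 1.0, 1.0)):
    """Compute simple shape descriptors for one segmented object.

    voxels:  iterable of (z, y, x) integer coordinates belonging to the object.
    spacing: physical size of one voxel along (z, y, x), in placeholder units.
    """
    voxels = list(voxels)
    if not voxels:
        raise ValueError("object has no voxels")
    voxel_volume = spacing[0] * spacing[1] * spacing[2]
    # Physical side lengths of the axis-aligned bounding box.
    extents = []
    for axis in range(3):
        coords = [v[axis] for v in voxels]
        extents.append((max(coords) - min(coords) + 1) * spacing[axis])
    volume = len(voxels) * voxel_volume
    bbox_volume = extents[0] * extents[1] * extents[2]
    return {
        "volume": volume,
        "extent_zyx": tuple(extents),
        # Ratio of the longest to the shortest bounding-box side.
        "elongation": max(extents) / min(extents),
        # Share of the bounding box that the object fills.
        "fill": volume / bbox_volume,
    }

# Two toy objects: a six-voxel rod and a compact 2 x 2 x 2 blob.
rod = [(0, 0, x) for x in range(6)]
blob = [(z, y, x) for z in range(2) for y in range(2) for x in range(2)]

rod_features = geometric_features(rod)
blob_features = geometric_features(blob)
print(rod_features["elongation"], blob_features["elongation"])
```

Bounding-box descriptors are crude for curved or branching objects, so they serve here only to show the shape of the computation: per-object measurements like these can be turned into per-sample summaries that a classifier then consumes.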

Outcome

In two months, Richter and Starschema created the first fully functioning, validated machine learning toolset for 3D imagery that quantifies the properties of a mitochondrial network and decides whether the cells are healthy or impaired. The team also saw promising results in creating an automatic system for detecting mitochondria in high-resolution 3D confocal microscopy images. While a larger dataset and more effective image post-processing will be necessary to fully realize this part of the solution and make it generally applicable, the methodology represents a major advancement over earlier solutions when used on Richter’s data.

Compared to solutions based on 2D images, the new methodology does not require experts’ rules for classification but can instead learn patterns from data and utilize information previously unavailable to biologists. Because the system works in an interpretable way, experts can analyze the revealed patterns in detail and validate their efficacy in differentiating the classes. As a result, Richter can apply the machine learning model to large image sets to judge automatically and effectively whether a particular treatment has been successful. Richter has concluded that this tool will be essential in evaluating the effects of drug candidates on neuronal mitochondrial networks.
