Data analytics and machine learning in life sciences

Solution Accelerators for life sciences

Drawing on best practices from our work with leading pharmaceutical and biotech organizations, we’ve developed Solution Accelerators for common analytics and machine learning use cases. They are designed to save your data scientists, engineers and analysts weeks or months of development time.

Real-world Evidence (RWE) Lakehouse

Real-world data (RWD) can be used to generate real-world evidence (RWE), which in turn gives pharmaceutical companies new insights into patient health and drug efficacy outside of a clinical trial. This accelerator notebook helps you build a Lakehouse for real-world evidence on Databricks. We’ll show you how to ingest sample electronic health record (EHR) data for a patient population, structure the data using the OMOP common data model and then run analyses at scale, such as investigating drug prescription patterns.
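As a rough illustration of the kind of prescription-pattern analysis described above, the sketch below counts how many distinct patients received each drug. The field names `person_id` and `drug_concept_id` follow the OMOP `drug_exposure` table; the sample rows, the placeholder concept IDs and the helper name `patients_per_drug` are assumptions made for this example. A real Lakehouse would most likely express the same aggregation in Spark SQL rather than plain Python.

```python
from collections import defaultdict

# Illustrative rows shaped like OMOP drug_exposure records.
# The concept IDs (101, 102, 103) are placeholders, not real OMOP concepts.
drug_exposure = [
    {"person_id": 1, "drug_concept_id": 101},
    {"person_id": 1, "drug_concept_id": 101},  # repeat prescription, same patient
    {"person_id": 2, "drug_concept_id": 101},
    {"person_id": 2, "drug_concept_id": 102},
    {"person_id": 3, "drug_concept_id": 102},
    {"person_id": 3, "drug_concept_id": 103},
]


def patients_per_drug(rows):
    """Return {drug_concept_id: number of distinct patients exposed}."""
    patients = defaultdict(set)
    for row in rows:
        # A set per drug ensures repeat prescriptions count a patient once.
        patients[row["drug_concept_id"]].add(row["person_id"])
    return {drug: len(people) for drug, people in patients.items()}


counts = patients_per_drug(drug_exposure)
```

Counting distinct patients rather than raw rows keeps repeat prescriptions from inflating a drug's apparent reach.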


Patient Risk Scoring

One of the most powerful tools for identifying patients at risk for a chronic condition is the analysis of real-world data (RWD). This solution accelerator notebook provides a template for building a machine learning model that assesses a patient’s risk of developing a given condition within a given time window, based on the patient’s encounter history and demographic information.
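To make the setup concrete, here is a minimal sketch of how one training example for such a model might be assembled: features come only from encounters before a chosen index date, and the label records whether the condition appears within the window that follows. The function name `build_example`, the `(date, condition_code)` tuple format and the three features are hypothetical choices for illustration, not the accelerator's actual schema.

```python
from datetime import date, timedelta


def build_example(encounters, birth_year, index_date, window_days, condition):
    """Build one (features, label) pair for a single patient.

    encounters: list of (date, condition_code) tuples (hypothetical format).
    Features use only history strictly before index_date; the label looks at
    the window_days beginning on index_date.
    """
    window_end = index_date + timedelta(days=window_days)
    history = [(d, c) for d, c in encounters if d < index_date]
    future = [(d, c) for d, c in encounters if index_date <= d < window_end]

    features = {
        "age": index_date.year - birth_year,
        "n_prior_encounters": len(history),
        "n_distinct_prior_conditions": len({c for _, c in history}),
    }
    # 1 if the target condition is recorded inside the prediction window.
    label = int(any(c == condition for _, c in future))
    return features, label


encounters = [
    (date(2020, 1, 10), "hypertension"),
    (date(2020, 6, 5), "obesity"),
    (date(2021, 3, 1), "diabetes"),
]
features, label = build_example(encounters, 1960, date(2021, 1, 1), 365, "diabetes")
```

Separating history from the prediction window in this way avoids leaking future information into the features. In practice, patients who already have the condition before the index date would usually be excluded from training.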

Digital Pathology Automation

Modern imaging technologies enable healthcare providers to rapidly digitize high-resolution pathology slides. These large data sets can be used to build automated diagnostics with machine learning that, in turn, help providers improve the efficiency and effectiveness of diagnosing cancer and infectious disease. This solution accelerator provides an automated methodology for rapidly identifying regions of metastases in whole slide images with deep learning.
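Whole slide images are typically far too large to feed to a neural network in one piece, so a common preprocessing step is to split each slide into fixed-size tiles that are scored individually. The sketch below generates tile coordinates only; the function name `tile_origins` and the example dimensions are assumptions, and it is not meant to reflect how the accelerator itself tiles slides.

```python
def tile_origins(width, height, tile_size, stride):
    """Return (x, y) top-left corners of tiles that fit fully inside the slide."""
    xs = range(0, width - tile_size + 1, stride)
    ys = range(0, height - tile_size + 1, stride)
    # Row-major order: sweep left to right, then move down one row of tiles.
    return [(x, y) for y in ys for x in xs]


# Hypothetical slide of 1000 x 600 pixels cut into non-overlapping 256-pixel tiles.
origins = tile_origins(width=1000, height=600, tile_size=256, stride=256)
```

Setting `stride` smaller than `tile_size` produces overlapping tiles, which is one way to avoid missing a region that straddles a tile boundary.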

Association Studies

Genome-wide association studies (GWAS) help identify genetic variations that are associated with a particular disease. This information can be used to better detect, treat and prevent chronic conditions such as asthma, cancer, diabetes and heart disease. This solution accelerator and open-source project provides a new, scalable method for whole-genome regression.
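The basic idea behind an association test can be shown with a single variant: regress a quantitative phenotype on the number of alternate alleles each individual carries and look at the slope. The sketch below is a plain ordinary least squares fit on made-up data, not the whole-genome regression method the accelerator provides; the function name `ols_slope` and the sample values are assumptions.

```python
def ols_slope(genotypes, phenotypes):
    """Least-squares slope of phenotype on genotype dosage (0, 1 or 2 alt alleles).

    Raises ZeroDivisionError if every individual has the same genotype.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    cov = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    var = sum((g - mean_g) ** 2 for g in genotypes)
    return cov / var


# Illustrative data: six individuals, phenotype rises with allele count.
genotypes = [0, 0, 1, 1, 2, 2]
phenotypes = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
slope = ols_slope(genotypes, phenotypes)
```

A real study would also compute a test statistic and p-value for each variant and adjust for covariates such as age, sex and ancestry, repeated across millions of variants.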

Oncology Real-World Data Extraction

Unstructured, text-based pathology reports contain critical information that can be used to improve oncology research and treatment. Our joint solution accelerator with John Snow Labs makes it easy to generate oncology insights from real-world data using natural language processing (NLP).
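To give a sense of what extracting structured facts from a pathology report means, the sketch below pulls a TNM stage out of free text with a regular expression. This is a deliberately simplistic stand-in: the accelerator relies on NLP models rather than pattern matching, and the pattern, the function name `extract_tnm` and the sample sentence are all assumptions for illustration.

```python
import re

# Simplified TNM pattern: handles forms like "pT2 N1 M0" or "T3N0M0".
# It ignores subcategories (e.g. "T1a") and values such as "Tis" or "NX".
TNM_PATTERN = re.compile(r"\bp?T(?P<t>[0-4])\s*N(?P<n>[0-3])\s*M(?P<m>[01])\b")


def extract_tnm(report_text):
    """Return {"t": int, "n": int, "m": int} for the first TNM stage found, else None."""
    match = TNM_PATTERN.search(report_text)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


stage = extract_tnm("Invasive ductal carcinoma, pT2 N1 M0. Margins negative.")
```

Rules like this break down quickly on real reports, which vary in wording, abbreviations and layout; that variability is the main reason trained NLP models are used for this task.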
