Databricks’ Open Source Genomics Toolkit Outperforms Leading Tools
Genomic technologies are driving the creation of new therapeutics, from RNA vaccines to gene editing and diagnostics. Progress in these areas motivated us to build Glow, an open-source toolkit for genomics machine learning and data analytics. The toolkit is natively built on Apache Spark™, the leading engine for big data processing, enabling population-scale genomics. The...
Extracting Oncology Insights From Real-World Clinical Data With NLP
Preview the solution accelerator notebooks referenced in this blog online or get started right away by downloading and importing the notebooks into your Databricks account. Cancer is a leading cause of death and disease in the U.S., and the numbers are staggering, with nearly 2 million new cases of cancer expected to be diagnosed in...
Timeliness and Reliability in the Transmission of Regulatory Reports
Managing risk and regulatory compliance is an increasingly complex and costly endeavour. Regulatory change has increased 500% since the 2008 global financial crisis, driving up regulatory costs in the process. Given the fines associated with non-compliance and SLA breaches (banks hit an all-time high in fines of $10 billion in 2019 for AML), processing...
Improving On-Shelf Availability for Items With AI Out of Stock Modeling
This post was written in collaboration with Databricks partner Tredence. We thank Rich Williams, Vice President Data Engineering, and Morgan Seybert, Chief Business Officer, of Tredence for their contributions. Retailers are missing out on nearly $1 trillion in global sales because they don’t have on hand what customers want to buy in their stores. Adding...
Solution Accelerator: Multi-touch Attribution
Behind the growth of every consumer-facing product is the acquisition and retention of an engaged user base. When it comes to customer acquisition, the goal is to attract high-quality users as cost effectively as possible. With marketing dollars dispersed across a wide array of different touchpoints -- campaigns, channels, and creatives -- measuring effectiveness is...
Unlocking the Power of Health Data With a Modern Data Lakehouse
A single patient produces approximately 80 megabytes of medical data every year. Multiply that across thousands of patients over their lifetimes, and you’re looking at petabytes of patient data that contains valuable insights. Unlocking these insights can help streamline clinical operations, accelerate drug R&D and improve patient health outcomes. But first, the data needs to...
AML Solutions at Scale Using Databricks Lakehouse Platform
Anti-Money Laundering (AML) compliance has undoubtedly been one of the top agenda items for regulators providing oversight of financial institutions across the globe. As AML has evolved and become more sophisticated over the decades, so have the regulatory requirements designed to counter modern money laundering and terrorist financing schemes. The Bank Secrecy Act of 1970 provided...
Solution Accelerator: Toxicity Detection in Gaming
Across massively multiplayer online video games (MMOs), multiplayer online battle arena games (MOBAs) and other forms of online gaming, players continuously interact in real time to either coordinate or compete as they move toward a common goal -- winning. This interactivity is integral to gameplay dynamics, but at the same time, it’s a prime...
How to Build a Scalable Wide and Deep Product Recommender
Download the notebooks referenced throughout this article. I have a favorite coffee shop I’ve been visiting for years. When I walk in, the barista knows me by name and asks if I’d like my usual drink. Most of the time, the answer is “yes”, but every now and then, I see they have some seasonal...
Machine Learning-based Item Matching for Retailers and Brands
Item matching is a core function in online marketplaces. To optimize the customer experience, retailers compare new and updated product information against existing listings to ensure consistency and avoid duplication. Online retailers may also compare their listings with those of their competitors to identify differences in price and inventory. Suppliers making products available across...