Implementing More Effective FAIR Scientific Data Management With a Lakehouse
Data powers scientific discovery and innovation. But data is only as good as its data management strategy, the key factor in ensuring data quality, accessibility, and reproducibility of results – all requirements of reliable scientific evidence. As large datasets have become more and more important and accessible to scientists across disciplines, the problems of big...
How Incremental ETL Makes Life Simpler With Data Lakes
Incremental ETL (Extract, Transform and Load) in a conventional data warehouse has become commonplace with CDC (change data capture) sources, but scale, cost, accounting for state and the lack of machine learning access make it less than ideal. In contrast, incremental ETL in a data lake hasn’t been possible due to factors such as the...
How to Manage End-to-end Deep Learning Pipelines with Databricks
Deep Learning (DL) models are being applied to use cases across all industries -- fraud detection in financial services, personalization in media, image recognition in healthcare and more. With this growing breadth of applications, using DL technology today has become much easier than just a few short years ago. Popular DL frameworks such as TensorFlow...
Solution Accelerator: Multi-touch Attribution
Behind the growth of every consumer-facing product is the acquisition and retention of an engaged user base. When it comes to customer acquisition, the goal is to attract high-quality users as cost-effectively as possible. With marketing dollars dispersed across a wide array of different touchpoints -- campaigns, channels, and creatives -- measuring effectiveness is...
Make Your RStudio on Databricks More Durable and Resilient
One of the questions that we often hear from our customers these days is, “Should I develop my solution in Python or R?” There is no right or wrong answer to this question, as it largely depends on the available talent pool, functional requirements, availability of packages that fit the problem domain and many other...
Four E-commerce Challenges That Can Be Addressed With Data + AI
The global health crisis accelerated the adoption of omnichannel shopping and fulfillment. Consumers spent $861.12 billion online with U.S. merchants in 2020, up an incredible 44% compared to the previous year, which marks the highest annual growth in U.S. e-commerce in at least two decades. To keep pace with this shift and more effectively...
Applying Natural Language Processing to Healthcare Text at Scale
This is a co-authored post written in collaboration with Moritz Steller, AI Evangelist at John Snow Labs. Don't miss our virtual workshop, Extract Real-World Data with NLP, on July 15 to learn about our new NLP solutions. In 2015, HIMSS estimated that the healthcare industry in the U.S. produced 1.2 billion clinical documents. That’s a...
How to Build a Scalable Wide and Deep Product Recommender
Download the notebooks referenced throughout this article. I have a favorite coffee shop I’ve been visiting for years. When I walk in, the barista knows me by name and asks if I’d like my usual drink. Most of the time, the answer is “yes”, but every now and then, I see they have some seasonal...
Jump Start Your Data Projects with Pre-Built Solution Accelerators
Deliver value faster. We hear this theme in nearly every executive discussion with customers. Data teams and data leaders need to deliver value in weeks, not months or years. The business climate is volatile, and they don’t have the luxury of long project timelines to deliver data and analytic capabilities designed to drive business value, such...
Fine-Grained Time Series Forecasting at Scale With Facebook Prophet and Apache Spark: Updated for Spark 3
Advances in time series forecasting are enabling retailers to generate more reliable demand forecasts. The challenge now is to produce these forecasts in a timely manner and at a level of granularity that allows the business to make precise adjustments to product inventories. Leveraging Apache Spark™ and Facebook Prophet, more and more enterprises facing these...