Organizations Can Seamlessly Combine Third-party Data with Existing Data Lakes to Perform Advanced Data Science and Analytics at Scale on AWS
SAN FRANCISCO–(BUSINESS WIRE)–Databricks, the leader in unified data analytics, today announced API integration with AWS Data Exchange, a new service that makes it easy for millions of Amazon Web Services (AWS) customers to securely find, subscribe to, and use third-party data in the cloud. By integrating AWS Data Exchange into the Databricks Unified Data Analytics Platform, organizations can rapidly combine their internal data with third-party data sets from data providers and seamlessly perform data science and advanced analytics to gain deeper insights. In addition, data providers who use Databricks gain a streamlined approach to creating, managing, and delivering their data sets for consumption by data subscribers.
“AWS Data Exchange allows customers to easily find, subscribe to, and use third party data in a cloud-native way,” said Noah Schwartz, Head of Engineering, AWS Data Exchange, Amazon Web Services, Inc. “We’re delighted to work with Databricks to help customers securely and easily find and use data for research, analytics, and training machine learning models to optimize their business through data-driven decisions.”
To stay competitive, organizations want to gain deeper insights from data, but they continue to encounter time-consuming roadblocks. Organizations struggle to obtain high-quality third-party data, and then to gather, load, organize, and manage that data. Once they complete this process, applying data science and managing that lifecycle adds a further level of complexity. The integration between the AWS Data Exchange API and Databricks, which includes pre-built notebooks, makes it easy for data subscribers to onboard, test, transform, and combine data from data providers.
“Organizations combine third party data with existing data lakes to take advantage of a holistic data set that can drive to deeper insights. AWS Data Exchange can empower organizations to easily discover, purchase and ingest new data sets from data providers,” said Pankaj Dugar, vice president of Technology and Data Provider Partnerships, Databricks. “In combination with our Unified Data Analytics Platform, organizations now have the ability to combine internal and third-party data sets and apply data science and advanced analytics.”
Databricks enables thousands of organizations to make data-driven decisions. By streamlining the process of onboarding and analyzing third-party data, the integration makes advanced analytics and machine learning scenarios easier to implement. Examples include applying customer purchase data to refine offers, using weather data for inventory management, and optimizing store placement and offers based on customer location data.
“At SafeGraph, enabling organizations with the most accurate Points-of-Interest (POI) data, business listings, and store visitor insights data is our core business,” said Jonathan Wolf, Head of Partnerships at SafeGraph. “We are excited to have our data offerings in AWS Data Exchange, and to partner with Databricks and leverage their Unified Data Analytics Platform to create our unique datasets.”
Databricks support for AWS Data Exchange is available immediately. For more information, visit www.databricks.com/aws-data-exchange. To learn more about blending third-party datasets to gain greater customer purchase insights, register now for the webinar “Building Reliable Data Pipelines for Machine Learning at SafeGraph,” featuring Databricks and SafeGraph.
Databricks helps data teams solve the world’s toughest problems. As the leader in Unified Data Analytics, Databricks helps organizations make all their data ready for analytics, empower data-driven decisions across the organization, and rapidly adopt machine learning to outpace the competition. The company’s global customer base has thousands of organizations including Comcast, Shell, Expedia, and Regeneron. Databricks is venture-backed and founded by the original creators of popular open source projects, including Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.
Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.