Informatica Announces Industry’s Most Comprehensive Big Data Management Solution for Apache Spark-Based Big Data Clouds
- Empowering Data Engineers, Data Science Teams, and Data Operations with New Innovations in AI-driven Data Lake Management
- Cloud-ready, AI-driven big data management and big data streaming on AWS, Azure and Google Cloud Platform
- Governed self-service data discovery, data cataloging and data preparation for all users
- Serverless big data integration with auto-scaling and tuning for massive scalability, execution optimization, dramatic cost reduction and increased efficiencies
Informatica®, the enterprise cloud data management leader, announced the next generation of the industry’s most comprehensive data management solution for Apache Spark-based big data cloud environments. These new innovations, powered by the CLAIRE™ engine, enable organizations to stream, ingest, process, cleanse, protect, and govern even more big data with less effort.
The new AI-driven hybrid big data management solution delivers more trusted information assets and accelerates self-service analytics using machine learning for hybrid and multi-cloud environments. The solution will help organizations overcome the challenges of managing and governing large data volumes flowing into and through data lake environments on-premises and in the cloud.
The new innovations include:
- Increasing data engineering productivity with even broader support for big data clouds like Google Cloud Dataproc and new advanced serverless Spark-based integrations with Qubole and Azure Databricks. Additionally, users will benefit from rapid development of IoT data pipelines with machine-learning-driven structure discovery of semi-structured datasets (e.g., machine data).
- Empowering data analyst and data science teams with advanced self-service data discovery and data preparation with 50+ new functions. Examples include statistical and windowing functions, fuzzy clustering and matching rules, along with more controlled access to data through data masking and the ability to ingest logical models and business terms into a data catalog.
- Optimizing data operations through improved monitoring of data infrastructure, using machine-learning-driven operational management with proactive actions and recommendations.
“The Qubole and Informatica partnership enables organizations to design advanced data management pipelines including integration, data quality, masking, and more and execute them in a self-service cloud native platform for end-to-end big data processing,” said Ashish Thusoo, co-founder and CEO, Qubole. “The partnership provides customers accelerated time to value and success adopting next-generation analytics.”
“Big data management is going through a wave of innovation that empowers data operations teams to efficiently and effectively collaborate and interact with large volumes of company data for crucial analytics projects,” said Ronen Schwartz, senior vice president and general manager, Cloud, Big Data and Data Integration, Informatica. “The new Informatica innovations empower all levels of data users to interact with huge data sets to glean insights. For example, data engineers can now build serverless data pipelines running on Apache Spark in the Cloud and provide data scientists with advanced self-service data prep, powered by AI and machine learning. In addition, on September 11, 2018, Informatica won the Cloudera Partner Impact Award for driving customer insights, furthering the notion that our big data innovations are delivering transformational value for our customers.”