Artificial Intelligence | News | Insights | AiThority

AiThority Interview with Carolyn Duby, Field CTO and Cyber Security GTM Lead at Cloudera

Hi Carolyn, welcome to our AiThority Interview Series. As a seasoned technology expert, can you share your journey with us?

Certainly. My career has been quite diverse, spanning high-performance medical devices, cybersecurity, and distributed big data solutions. That said, no matter the focus area, I’ve always had a fascination with technology. This began in high school, where I had early exposure to computer programming. This passion led me to earn both a bachelor’s and a master’s degree at Brown University, where I built a solid foundation in technology and the liberal arts.

My professional journey started at Cadre Technologies, where I honed my software engineering skills and formed lasting relationships in the industry. Later, I co-founded Pathfinder Solutions, wearing multiple hats and broadening my skill set. Then, at Dell SecureWorks, I specialized in real-time cybersecurity ingestion pipelines, deepening my understanding of reliability engineering and team leadership.

Transitioning to Cloudera (previously Hortonworks), I worked as a Solutions Engineer, which allowed me to leverage big data for business transformation. I found helping customers transform their capabilities with data especially liberating. Today, as a Field CTO, I advise our clients on how to best leverage their data with AI-driven transformations.

Please tell us about your Cloud AI product launches.

Without good data there is no AI. Cloudera is the foundation for building and deploying AI applications, and our customers are already using Gen AI for mission-critical use cases today, including chatbots, document summarization, and code generation.

Cloudera has recognized the transformative power of machine learning and AI since the beginning. Our AI-driven solutions, such as Cloudera Machine Learning (CML), focus on the challenges enterprises face in deploying production-grade solutions.

CML serves as a scalable and readily available collaboration platform, empowering engineers to explore data visually, conduct experiments, and seamlessly deploy and monitor models in production. Notably, pioneering enterprises like OCBC Bank have successfully leveraged CML to deploy production-grade Generative AI solutions securely, safely, and cost-effectively. As a hybrid multi-cloud platform, it also offers the flexibility to run AI applications securely and cost-effectively whether on-premises or in the cloud. By enabling customers to choose AI services and foundation models tailored to their specific use cases, we’re helping them future-proof their applications and minimize technical debt.

Cloudera is deepening its AI expertise further with our recent acquisition of Verta’s Operational AI Platform. Verta’s GenAI workbench and governance tools will amplify Cloudera’s platform, advancing enterprise AI adoption worldwide.

We also partner with AI leaders such as AWS, NVIDIA, Hugging Face, IBM and Pinecone to help accelerate our customers’ AI journeys, allowing them to build faster and deliver value right away from their trusted data.

In addition to our robust product offerings, we provide a comprehensive suite of capabilities to kickstart AI projects, including Applied Model Prototypes (AMPs) and the Enterprise AI Fast Start service offering to accelerate time-to-value for our customers, demonstrating our unwavering commitment to their success.

Could you share insights for Enterprise AI that currently command your attention and the rationale behind their prioritization?

My primary focus is helping customers build AI applications that are not only efficient but also foster trust, strengthen security, and bolster brand reputation. Specifically, I’m deeply engrossed in platforms that enable rapid implementation of cutting-edge cyber AI defense mechanisms. Given the escalating frequency and sophistication of cyber attacks, it’s imperative for enterprises to quickly deploy advanced AI-driven analytics capable of detecting, investigating, and mitigating emerging threats. Cloudera stands out with its multi-functional capabilities, which support real-time ingestion and AI-driven analysis of vast volumes of cybersecurity logs, including DNS activity logs. This capability is pivotal in enabling enterprises to proactively identify and respond to novel cyber threats, thereby fortifying their cybersecurity posture and safeguarding critical assets.

How did NVIDIA integration empower Cloudera to use data-driven insights to power mission-critical use cases such as fraud detection?

NVIDIA’s integration has significantly empowered Cloudera to use data-driven insights for critical tasks like fraud detection. By combining NVIDIA RAPIDS libraries for Apache Spark with NVIDIA-Certified Systems equipped with GPUs, Cloudera has drastically enhanced its data processing capabilities. This development means the IRS can now analyze vast amounts of data much more quickly, transforming tasks that used to take weeks or months into jobs that can be completed in days, hours, or even minutes. Such speed is crucial for timely fraud detection and prevention.

Additionally, the collaboration includes integrating NVIDIA’s AI Enterprise software platform and microservices, like NIM and CUDA-X, into Cloudera Machine Learning. Beyond providing powerful generative AI capabilities and performance, this integration enables enterprises to make more accurate and timely decisions, reducing inaccuracies, hallucinations, and errors in predictions—critical for navigating today’s data landscape across industries.

In the context of the IRS, this setup supports the creation and deployment of high-performance AI and machine learning models, which are essential for accurately analyzing complex data patterns and identifying potential fraud. The scalable and robust infrastructure provided by NVIDIA also ensures that these models can handle growing volumes of data without sacrificing performance or accuracy, which is vital for effective fraud detection.

Recent testing has also shown remarkable improvements in workflow efficiency, with a tenfold increase in speed and a 50% reduction in infrastructure costs. These efficiencies enable agencies like the IRS to use their resources more effectively, allowing for more thorough and frequent data analysis. This is critical for identifying and mitigating fraudulent activities.

By leveraging GPU-accelerated data processing and advanced AI tools, Cloudera and NVIDIA have also enabled the IRS to create and analyze large graphs that connect individuals to institutions and larger entities over time. The integration of NVIDIA’s AI Enterprise with Cloudera Machine Learning simplifies the creation of end-to-end generative AI workflows, significantly enhancing productivity and data-driven decision-making. This capability is key to uncovering complex fraud schemes that might otherwise go unnoticed.
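To make the graph idea above concrete, here is a minimal sketch in plain Python. It is purely illustrative and is not the IRS's or Cloudera's implementation: the records, the names, and the helper functions `build_graph` and `connected_component` are all hypothetical, and a production system would rely on GPU-accelerated tooling rather than in-memory dictionaries.

```python
from collections import defaultdict, deque

# Hypothetical (person, institution, year) records, invented for illustration.
RECORDS = [
    ("alice", "bank_a", 2019),
    ("bob", "bank_a", 2020),
    ("bob", "shell_co", 2021),
    ("carol", "shell_co", 2021),
    ("dave", "bank_b", 2022),
]


def build_graph(records, until=None):
    """Build an undirected adjacency map linking people and institutions.

    If `until` is given, only records dated up to that year are included,
    which lets the same data be viewed as it stood at different times.
    """
    graph = defaultdict(set)
    for person, institution, year in records:
        if until is not None and year > until:
            continue
        graph[person].add(institution)
        graph[institution].add(person)
    return graph


def connected_component(graph, start):
    """Return every node reachable from `start` using breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Everything linked to "alice" across all years, then as of 2020 only.
linked_all = connected_component(build_graph(RECORDS), "alice")
linked_2020 = connected_component(build_graph(RECORDS, until=2020), "alice")
print(sorted(linked_all))
print(sorted(linked_2020))
```

In this toy data, "alice" becomes indirectly connected to "carol" only once the 2021 records are included, which is the kind of relationship over time that a flat table view can hide.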

Could you please brief me on your recent project of open data lakehouse on a private cloud?

Cloudera has recently announced significant advancements in its open data lakehouse on private clouds. This development is designed to revolutionize how enterprises manage their on-premises data for trusted analytics and AI at scale. The core enhancement includes the integration of Apache Iceberg, an open-source, high-performance table format, now available for both public and private clouds. This positions Cloudera uniquely as the sole provider of this comprehensive solution.

The motivation behind these enhancements stems from the current landscape where a substantial portion of U.S. organizations (53%) are leveraging Generative AI technology, while many others are in the exploratory phase. Despite this interest, enterprises face considerable hurdles in deriving business value from their data, often due to fragmented data infrastructures, governance risks, or security concerns. Cloudera’s integration of Apache Iceberg aims to address these challenges, enabling organizations to efficiently scale their AI deployments and derive meaningful insights from their data, whether it’s stored in the cloud or on-premises.

A standout feature of this update is the ability to run a completely air-gapped large language model (LLM) deployment. This ensures enhanced security and data privacy while also improving performance and reducing operational costs. Alongside this, Cloudera has introduced several other updates, including zero downtime upgrades (ZDU), advanced security measures like TLS 1.2, and new capabilities within Apache Ozone for improved scalability and cost efficiency. Additionally, expanded support for various integrations enhances compatibility and flexibility for enterprises.

It’s been exciting to receive great feedback on these advancements from users, who have highlighted the importance of an open table format for ease of data access and maintenance. An open format ensures that data ownership remains with the enterprise, independent of technology shifts. Overall, it represents a significant leap forward in helping organizations unlock the full potential of their data with AI, providing a secure, scalable, and flexible platform for innovative AI applications.

How should young technology professionals train themselves to work better with the Cloud AI platforms?

Young technology professionals looking to enhance their skills in working with Cloud AI platforms should prioritize hands-on experience. Engaging directly with the tools and platforms is the most effective way to learn and understand the intricacies of AI. There are numerous online courses available that are either free or low-cost, providing accessible learning opportunities. Some platforms I highly recommend include Coursera and edX, which offer a wide range of AI-related courses designed by reputable institutions.

Additionally, pursuing certification paths from major cloud service providers like AWS can be extremely beneficial. These certifications provide structured learning paths and validate your skills, making you more marketable in the job market.

One notable course to consider is Cloudera’s “Introduction to Cloudera Machine Learning.” This course offers a comprehensive overview of the Cloudera Machine Learning (CML) platform, delivered through video lectures and demonstrations. It covers essential topics such as recognizing the primary capabilities and suitable applications of CML, navigating the user account and project contexts, and utilizing the CML workbench to view and edit code files.

The Cloudera course is ideal for those who want to gain a thorough understanding of the CML environment and how to effectively navigate its interface. No prior experience is required, making it accessible for beginners who are keen to develop their AI and machine learning skills. By combining these educational resources with hands-on practice, young professionals can build a solid foundation in Cloud AI platforms and enhance their proficiency in this rapidly evolving field.

What is your perception of cloud-based AI services? How do they offer a competitive advantage?

Cloud-based AI services bring a lot of advantages to the table. Of note, they allow businesses to quickly implement generative AI features and versatile models that work well across various applications. Since there’s no need to maintain any infrastructure, companies can scale efficiently and only pay for what they use, making these services both cost-effective and flexible.

However, there are some limitations to keep in mind. For instance, sending sensitive customer data to a third-party service can raise security concerns. Costs can also add up, especially with extensive or high-frequency use. Additionally, relying on a third-party provider for critical AI services can be risky. Providers might update their APIs or even discontinue services, which could disrupt your operations.

At Cloudera, we work to understand these challenges and support your choice of a public cloud AI or a foundation model. Either way, it is so important to develop solutions that address these challenges so that customers can take full advantage of the benefits and lessen concerns around security, cost, and service reliability.

As stated by Cloudera, 53% of organizations in the U.S. currently use Generative AI. Looking beyond 2025, where do you see AI heading?

AI capabilities, along with best practices, are evolving rapidly and revolutionizing virtually every industry. Over the next few years, enterprises with mature and sustainable data management programs will deliver AI solutions earlier and more effectively. Conversely, organizations with fledgling or siloed data management programs will need to mature their data strategies or risk being overtaken by smaller, more nimble competitors leveraging data and automation.

Less mature data programs, while not burdened by legacy systems, often lack the experience in data governance, security, and literacy necessary to make effective use of their data. As businesses navigate this powerful new technology and evolving regulations, we will witness both tremendous successes and spectacular failures that will impact brand loyalty. Brands that work earnestly and deliberately to deliver significant business value safely, and with empathy towards customers and employees, will grow.

Beyond 2025, efficiency gains will enable enterprises to deliver more capable AI-driven systems by training smaller, custom models with proprietary high-quality data and assembling AI agents into fully or mostly autonomous systems. Security teams will face new challenges with the rise in realistic phishing emails and scams involving deepfakes.

Looking further ahead, I am particularly excited about the potential of quantum computing and how it could revolutionize AI. This technology promises to bring unprecedented advancements, pushing the boundaries of what AI can achieve.

Tag one person in the AI/ML sector, whose answers to these questions you would love to read.

I would love to hear from our customers on their AI journey.

Thank you, Carolyn! That was fun, and we hope to see you back soon.

Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises. Learn more at
