Optimizing AI Advancements through Streamlined Data Processing across Industries
Data is the foundation of today's most prominent AI applications. To be accurate and effective, AI models must be trained on broad, diverse datasets. To unlock AI's potential, enterprises must build a data pipeline that extracts data from a variety of sources, transforms it into a consistent format, and stores it efficiently. Data scientists then run numerous experiments to refine those datasets and optimize models for real-world use. Applications ranging from personalized recommendation systems to voice assistants demand rapid processing of large data volumes to deliver real-time performance.
I will take you through 7 different domains:
- Financial services
- Telecommunications
- Healthcare and pharma
- Utilities
- Automakers
- Retail
- Public sector
Financial institutions detect fraud in milliseconds
Financial institutions struggle to detect fraud because of the sheer volume of transactional data that needs rapid processing. The scarcity of labeled fraud data also makes training AI models difficult. Fraud detection data volumes are too large for traditional, unaccelerated data science pipelines, which slows processing and prevents real-time analysis and detection.
American Express, which processes over 8 billion transactions annually, trains and deploys long short-term memory (LSTM) models on accelerated computing to tackle these issues. LSTM models suit fraud detection because they analyze data sequentially, flag anomalies, and adapt as fresh data arrives.
American Express trains its LSTM models faster using parallel computing on GPUs. GPUs also let live models churn through massive transactional data for real-time fraud detection. To protect customers and merchants, the system responds within two milliseconds, 50x faster than a CPU-based design. By combining the accelerated LSTM deep neural network with its existing methods, American Express raised fraud detection accuracy by 6% in certain segments. Accelerated computing can also lower data processing costs for financial companies: PayPal showed that running Spark 3 workloads on NVIDIA GPUs could cut cloud costs by up to 70% for big data processing and AI applications.
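American Express's production LSTM models are proprietary, but the core idea of scoring each incoming transaction against recent history can be sketched in a few lines. The sliding-window z-score detector below is a hypothetical, much-simplified stand-in for a learned sequential model; all names are illustrative:

```python
from collections import deque
from statistics import mean, stdev

def anomaly_score(window, amount):
    """Z-score of a new transaction amount against a recent window."""
    if len(window) < 2:
        return 0.0
    mu, sigma = mean(window), stdev(window)
    if sigma == 0:
        return 0.0
    return abs(amount - mu) / sigma

def flag_transactions(amounts, window_size=5, threshold=3.0):
    """Flag amounts whose z-score vs. the trailing window exceeds threshold."""
    window = deque(maxlen=window_size)
    flags = []
    for amt in amounts:
        flags.append(anomaly_score(window, amt) > threshold)
        window.append(amt)  # update the trailing window after scoring
    return flags

# A run of typical amounts followed by one outlier:
# flag_transactions([10, 12, 11, 13, 12, 500]) flags only the last item.
```

A real pipeline would score a learned model's output rather than a raw z-score, but the per-transaction, low-latency structure is the same.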
Telcos simplify complex routing operations
Telecommunications companies generate massive amounts of data from network devices, customer interactions, billing systems, and network performance and maintenance. Managing national networks that handle hundreds of petabytes of data daily involves intricate technician routing for service delivery. To optimize technician dispatch, advanced routing engines perform millions of computations, factoring in weather, technician skills, customer requests, and fleet distribution. These operations require careful data preparation and substantial computational power.
AT&T, which runs one of the nation’s largest field dispatch teams, is improving data-heavy routing operations with NVIDIA cuOpt, which solves complex vehicle routing problems using heuristics, metaheuristics, and optimizations. In early trials, cuOpt delivered routing solutions in 10 seconds, cut cloud costs by 90%, and let technicians complete more service calls per day. NVIDIA RAPIDS, a suite of software libraries that accelerates data science and analytics pipelines, underpins cuOpt, letting organizations apply local search methods and metaheuristics like Tabu search for continuous route improvement. AT&T is also using the NVIDIA RAPIDS Accelerator for Apache Spark to improve Spark-based AI and data pipelines. The company can now train AI models, maintain network quality, reduce customer churn, and detect fraud more efficiently. With the RAPIDS Accelerator, AT&T is cutting cloud computing spend for targeted applications, improving performance, and lowering its carbon footprint.
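cuOpt's internals are not public, but the family of techniques named above, construction heuristics refined by local search, can be illustrated with a toy example. The sketch below builds a tour with a nearest-neighbor heuristic and then improves it with 2-opt local search; all function names are illustrative and the scale is trivially small:

```python
import math

def tour_length(points, tour):
    """Total length of a closed tour over 2-D points."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def nearest_neighbor(points):
    """Construction heuristic: always visit the closest unvisited point."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda j: math.dist(points[last], points[j]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def two_opt(points, tour):
    """Local search: reverse segments while doing so shortens the tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(points, cand) < tour_length(points, tour):
                    tour, improved = cand, True
    return tour
```

Production engines add time windows, skill constraints, and fleet capacities on top of this loop, and run it at GPU scale, but the construct-then-improve shape is the same.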
Medical researchers condense drug discovery timelines
Medical data and peer-reviewed research publications have exploded as academics use technology to explore the roughly 25,000 genes in the human genome and their effects on disease. Medical researchers rely on these publications to narrow their search for new medicines, but such a huge and growing body of relevant research makes manual literature review impractical.
Pharma giant AstraZeneca created a Biological Insights Knowledge Graph (BIKG) to help scientists with literature reviews, screen hit rates, target identification, and more. The graph models 10 million to 1 billion complex biological relationships using public and internal datasets and scholarly publications. To narrow down candidate genes for therapy development, data scientists and biological researchers defined criteria and gene features suited to targeting. A machine learning algorithm then searched the BIKG databases for genes with the treatable properties cited in the literature. Using NVIDIA RAPIDS, the team cut the gene pool from 3,000 to 40 target genes in seconds, a task that previously took months. By pairing accelerated computing with AI, pharmaceutical companies and researchers can finally exploit massive medical datasets to produce breakthrough treatments faster and more safely, saving lives.
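The BIKG itself is internal to AstraZeneca, but the filtering step described above, narrowing a gene pool by checking each candidate's features against defined criteria, can be sketched generically. The gene data and criteria below are invented for illustration only:

```python
def filter_targets(genes, criteria):
    """Keep genes whose features satisfy every (feature, predicate) criterion."""
    return [name for name, feats in genes.items()
            if all(pred(feats.get(feat)) for feat, pred in criteria.items())]

# Hypothetical feature table; a real knowledge graph holds millions of edges.
genes = {
    "GENE_A": {"druggable": True,  "expression": 0.9, "citations": 120},
    "GENE_B": {"druggable": False, "expression": 0.8, "citations": 45},
    "GENE_C": {"druggable": True,  "expression": 0.2, "citations": 300},
}

# Criteria a research team might define for tractable targets.
criteria = {
    "druggable":  lambda v: v is True,
    "expression": lambda v: v is not None and v > 0.5,
}
# filter_targets(genes, criteria) keeps only "GENE_A".
```

At BIKG scale the same predicate-per-feature filter runs over GPU dataframes (e.g. RAPIDS cuDF) instead of Python dicts, which is what turns months into seconds.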
Utility Firms Build Clean Energy’s Future
The energy sector's shift to carbon-neutral sources enjoys broad support. Over the past decade, the cost of capturing renewable resources like solar energy has dropped, making it easier than ever to move toward a clean energy future.
Integrating clean energy from wind farms, solar farms, and home batteries has complicated grid management. As energy infrastructure diversifies and two-way power flows become necessary, grid management grows more data-intensive. New smart grids must also handle high-voltage vehicle charging stations and manage distributed energy storage and shifting network usage. Utilidata, a leading grid-edge software company, and NVIDIA developed Karman, a distributed AI platform for the grid edge built on a custom NVIDIA Jetson Orin edge AI module. Embedded in power meters, the chip and platform turn them into data-gathering and control stations that can handle thousands of data points per second.
Karman processes real-time, high-resolution meter data at the network edge, letting utilities analyze system status, estimate usage, and integrate distributed energy resources in seconds. Inference models running on edge devices let network operators quickly identify line defects, predict outages, and perform preventive maintenance to improve grid reliability. Karman thus helps utilities build smart grids with AI and fast data analytics, permitting tailored, localized electricity distribution that accommodates variable demand patterns without major infrastructure modifications and making grid modernization more cost-effective.
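Utilidata's Karman models are proprietary; as a hypothetical illustration of edge-side monitoring, the sketch below tracks an exponentially weighted moving average (EWMA) of meter readings and flags readings that deviate sharply from the running estimate, the kind of lightweight inference that fits on a meter-embedded module:

```python
def ewma_monitor(readings, alpha=0.3, tolerance=0.5):
    """Track an EWMA of meter readings; flag any reading that deviates
    from the running estimate by more than `tolerance` (relative)."""
    est = None
    alerts = []
    for r in readings:
        if est is not None and abs(r - est) > tolerance * abs(est):
            alerts.append(r)  # possible line defect or metering fault
        # Update the smoothed estimate after checking the new reading.
        est = r if est is None else alpha * r + (1 - alpha) * est
    return alerts
```

A deployed system would combine many such signals (voltage, current, phase) and feed flagged events to a trained model, but the stream-in, decide-in-place pattern is what makes grid-edge AI responsive.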
Automakers Make Self-Driving Cars Safer, More Accessible
Automakers want self-driving cars that can identify objects and navigate in real time. That requires high-speed data processing, including feeding live camera, lidar, radar, and GPS data into AI models that make safe navigation decisions. With multiple AI models plus preprocessing and postprocessing steps, the autonomous driving inference pipeline is complex. Traditionally these steps ran on client-side CPUs, which can create severe processing bottlenecks, an unacceptable risk in a safety-critical application.
Electric vehicle manufacturer NIO added NVIDIA Triton Inference Server to its inference pipeline to improve autonomous driving workflows. NVIDIA Triton is open-source inference-serving software that supports multiple frameworks. By centralizing data processing operations, NIO reduced latency by 6x in some essential areas and increased data throughput by 5x.
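Triton itself is configured rather than hand-coded, but the benefit of hosting preprocessing, model inference, and postprocessing as one server-side chain (rather than bouncing data between client CPUs and the server) can be illustrated with a toy composed pipeline. Every stage below is an invented stand-in:

```python
def make_pipeline(*stages):
    """Compose preprocessing, model, and postprocessing into one callable,
    mirroring how an inference server can host the whole chain in place."""
    def run(x):
        for stage in stages:
            x = stage(x)
        return x
    return run

# Toy stages standing in for camera-frame preprocessing, a detector,
# and decision postprocessing. Real stages are neural networks.
normalize = lambda frame: [v / 255.0 for v in frame]         # scale pixels to [0, 1]
model     = lambda frame: sum(frame) / len(frame)            # stand-in "detector score"
threshold = lambda score: "object" if score > 0.5 else "clear"

pipeline = make_pipeline(normalize, model, threshold)
# pipeline([255, 255, 255, 255]) -> "object"; pipeline([0, 0, 0, 0]) -> "clear"
```

Keeping all three stages co-located means intermediate tensors never cross a process or network boundary, which is one source of the latency reductions quoted above.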
Retailers Forecast Demand Better
Data processing and analysis are essential in retail for real-time inventory adjustments, customer personalization, and pricing optimization. The larger the retailer and the broader its catalog, the more sophisticated and compute-intensive its data operations. Walmart, the world’s largest retailer, turned to accelerated computing to improve forecasting accuracy for 500 million item-by-store combinations across 4,500 stores.
Walmart’s data science team built stronger machine learning algorithms to take on this massive forecasting task, but the computing environment began to fail and produce erroneous results. The company found that data scientists had to drop features from algorithms just to get them to finish. To improve forecasting, Walmart turned to NVIDIA GPUs and RAPIDS. The company now runs a forecasting algorithm with 350 data features to predict sales across all product categories, drawing on sales statistics, promotional activities, and external factors such as weather and events like the Super Bowl that affect demand.
Walmart improved prediction accuracy from 94% to 97%, eliminated $100 million in fresh produce waste, and reduced stockout and markdown scenarios with advanced algorithms. GPUs ran models 100x faster, finishing projects in four hours that would have taken weeks on a CPU.
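Walmart's 350-feature model is far richer than anything shown here, but the relationship between forecasts and the accuracy figures quoted above can be illustrated with a minimal baseline: a moving-average forecaster plus a mean absolute percentage error (MAPE) metric, where 100 minus MAPE gives a simple accuracy number. Both functions are illustrative:

```python
def moving_average_forecast(history, window=3):
    """Naive demand forecast: the mean of the most recent `window` periods."""
    return sum(history[-window:]) / window

def mape(actuals, forecasts):
    """Mean absolute percentage error across (actual, forecast) pairs.
    An accuracy figure like '97%' corresponds roughly to MAPE = 3."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)

# Forecast the next period from recent sales, then score it once the
# actual arrives: moving_average_forecast([8, 10, 12]) -> 10.0
```

Feature-rich gradient-boosted or deep models replace the moving average in practice, and GPUs make it feasible to keep all 350 features; the evaluation loop stays the same.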
Public Sector Prepares for Disasters
Public agencies and private companies use immense volumes of aerial imagery from drones and satellites to predict weather, track animal movements, and monitor environmental changes. This data helps researchers and planners make better decisions in agriculture, disaster management, and climate change response. Without location metadata, however, this imagery is far less useful.
A federal agency working with NVIDIA wanted a way to automatically locate photos lacking geolocation metadata for search and rescue, natural disaster response, and environmental monitoring. Pinpointing a small area within a larger aerial image without metadata is like finding a needle in a haystack. Geolocation algorithms must also account for variations in lighting, time, date, and angle. An NVIDIA solutions architect tackled the challenge with a Python-based application. CPU processing initially took over 24 hours; GPUs, which parallelize hundreds of operations where a CPU handles only a few, finished in minutes. After switching to CuPy, an open-source GPU-accelerated array library, the application ran 1.8 million times faster, producing results in 67 microseconds.
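The agency's geolocation code is not public, but the core operation, sliding a small patch over a larger image and scoring each offset, is easy to sketch. In the 1-D toy below every offset is scored independently, which is exactly the structure CuPy or a GPU can parallelize; the real task works on 2-D imagery with far more robust similarity measures, so treat this purely as an illustration:

```python
def ssd(a, b):
    """Sum of squared differences: 0 means a perfect match."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_match(image, patch):
    """Slide `patch` over a 1-D `image` and return the offset with the
    lowest SSD. Each offset's score is independent of the others, which
    is what makes the search embarrassingly parallel on a GPU."""
    n, m = len(image), len(patch)
    return min(range(n - m + 1), key=lambda i: ssd(image[i:i + m], patch))

# best_match([0, 0, 5, 9, 5, 0, 0], [5, 9, 5]) locates the patch at offset 2.
```

With CuPy, the inner loop becomes a single vectorized array expression evaluated for all offsets at once on the GPU, which is where speedups of this magnitude come from.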