NEC Launches Free “FireDucks” Software for Accelerating Data Analysis Using Python
Enables up to 16 times faster data preparation, reducing time and cost of data analysis
NEC Corporation announced the launch of “FireDucks”(1), a free software program designed to accelerate the table data analysis library “pandas,” which is used for analysis with Python—the most widely used programming language in the world. Capable of carrying out the data preparation required for data analysis up to 16 times(2) faster than existing products, this newly developed software significantly shortens the time spent on data analysis and lowers computing costs.
In recent years, it has become easier than ever to collect massive amounts of data, including sales data from point-of-sale (POS) terminals, e-commerce, and data from financial transactions. In order to extract valuable analytical results from such data, there is a growing need for data scientists to analyze it using artificial intelligence (AI) and machine learning (ML).
However, in order to prepare for data analysis, large data sets must first be preprocessed. Data scientists are said to spend approximately 45%(3) of their time preparing data, and this has become a major issue. In addition, the surge in data volume and evolution of AI and ML have led to increased computational complexity. As a result, higher computational costs (e.g., cloud costs) and the consequent rise in power consumption and CO2 emissions have also become problematic.
In view of this, NEC set out to develop FireDucks, a software program designed to accelerate pandas. To develop this software, NEC leveraged the high-performance programming technology and acceleration know-how it has cultivated in its thirty-plus years of experience developing supercomputers.
By making the beta version of FireDucks available to the general public free of charge, NEC hopes to reduce the hours data scientists spend analyzing data and to help address environmental issues by conserving power and lowering CO2 emissions.
Features
1. Accelerated performance
FireDucks is capable of accelerating software programs created using pandas by up to 16 times and on average by about five times(2). This reduces the overall time data scientists spend working on data analysis by approximately 30%(4).
The acceleration comes primarily from two techniques: parallel use of all CPU cores and reduction of computation. FireDucks uses every core of a multi-core CPU to process large data sets in parallel. Moreover, rather than executing each operation in the order and over the range specified in the program, it analyzes the overall process in advance to identify the data actually needed to produce the results, and processes only that data. This further shortens processing time.
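The computation-reduction idea can be illustrated with a toy example. The sketch below is not FireDucks code and makes no claim about how FireDucks is implemented; the class `LazyTable` and its methods are hypothetical names invented for this illustration. It records operations instead of running them, then works backwards from the requested result so that only the steps the result depends on are executed.

```python
class LazyTable:
    """Toy deferred table (hypothetical, for illustration only)."""

    def __init__(self, columns, ops=None):
        self._columns = columns      # dict: column name -> list of values
        self._ops = ops or []        # recorded (output, inputs, func) steps
        self.computed = []           # derived columns actually computed

    def with_column(self, name, inputs, func):
        # Record the step; nothing is computed yet.
        return LazyTable(self._columns, self._ops + [(name, tuple(inputs), func)])

    def collect(self, wanted):
        # Walk the recorded steps backwards, keeping only those that the
        # requested columns depend on (assumes each name is defined once).
        needed = set(wanted)
        plan = []
        for name, inputs, func in reversed(self._ops):
            if name in needed:
                needed.update(inputs)
                plan.append((name, inputs, func))

        # Execute the pruned plan in its original order.
        data = dict(self._columns)
        for name, inputs, func in reversed(plan):
            rows = zip(*(data[c] for c in inputs))
            data[name] = [func(*row) for row in rows]
            self.computed.append(name)
        return {c: data[c] for c in wanted}


sales = LazyTable({"price": [100, 250, 80], "qty": [2, 1, 5], "note": ["a", "b", "c"]})
pipeline = (
    sales.with_column("revenue", ["price", "qty"], lambda p, q: p * q)
    .with_column("note_upper", ["note"], str.upper)  # never needed below
    .with_column("revenue_with_tax", ["revenue"], lambda r: r * 110 // 100)
)

result = pipeline.collect(["revenue_with_tax"])
print(result)             # {'revenue_with_tax': [220, 275, 440]}
print(pipeline.computed)  # ['revenue', 'revenue_with_tax']
```

The `note_upper` step is written in the program but never runs, because the requested result does not depend on it. A production system would combine this kind of pruning with parallel execution across cores, which this sketch leaves out.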
2. High compatibility
Another feature of this software is its high compatibility with pandas. While some libraries achieve faster processing speeds than pandas, adopting them takes multiple steps, including rewriting the program. FireDucks, on the other hand, is easy to apply: only one line of the program needs to be changed, and the rest of the analysis code is written just as it would be with pandas.
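As a minimal sketch of what a one-line change might look like, the example below swaps the pandas import for FireDucks' pandas-compatible module. The article does not say which line is changed, so the module path `fireducks.pandas` is an assumption here, and the sample data is invented. The `try`/`except` fallback exists only so the sketch still runs where FireDucks is not installed.

```python
# Before: import pandas as pd
try:
    import fireducks.pandas as pd  # assumed module path for the FireDucks drop-in
except ImportError:
    import pandas as pd            # fallback so this sketch runs without FireDucks

# Everything below is ordinary pandas-style code and stays unchanged.
df = pd.DataFrame(
    {
        "store": ["A", "B", "A", "B"],   # illustrative data, not from the article
        "amount": [10, 5, 20, 15],
    }
)
totals = df.groupby("store")["amount"].sum()
print(int(totals["A"]), int(totals["B"]))  # 30 20
```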
Actual Results
The following results were obtained when FireDucks was used in actual operations by Toyota Technical Development Corporation(5) (TTDC).
- 60% reduction in time spent on data analysis using an in-house AI framework (Spicy MINT)
- 76% decrease in the operating time of the analysis PC
An interview in which TTDC employees who have used FireDucks spoke with members of the development team to provide feedback on the newly developed software can be viewed on the following website.
Future Plans
By providing the beta version of FireDucks free of charge and enabling data scientists to actually use it, NEC will work to improve its functionality while verifying its effectiveness, with the aim of commercializing it within FY2024.
(1) This software was developed with the support of the New Energy and Industrial Technology Development Organization (NEDO) in Japan.
(2) According to NEC test results based on the TPCx-BB benchmark.
(3) 2020 State of Data Science.
(4) Based on calculations performed internally by NEC.
(5) About Toyota Technical Development Corporation (TTDC): Focused on constructing optimum environments for product development through comprehensive solutions driven by cutting-edge information and technology.