AiThority Interview Series With Rajat Mukherjee, Co-Founder and CTO at Aiqudo
Know My Company
Tell us about your interaction with AI and Voice Automation technologies.
Voice is the logical progression for search, and this paradigm shift is unfolding as we speak. Voice recognition is already viable in many environments. It will become the predominant mode for users to get access to information and get things done. It is far easier to use, and advancements in AI for automatic speech recognition, natural language understanding and intent detection are making these extremely hard problems easier to tackle, at least in constrained environments. Based on my years of experience in the search and mobile industry, I believe that voice solutions will become ubiquitous and increasingly powerful.
How did you start in this space? What galvanized you to start Aiqudo?
I have been exposed to cutting-edge AI from the start of my career at IBM Research. I’ve seen early voice systems and speech recognition in ambient noise at IBM’s cafeteria. I was fortunate to have been connected with the Deep Blue vs. Kasparov chess match, when a computer first beat the human chess champion! It was historic. I was the scalability guy, delivering the games in real time to millions of users on the web, but I got to see this amazing AI accomplishment unfold up close.
I’ve been in search at Google, Yahoo and Verity for many years and “Task Completion” was always considered a key goal. Voice search accelerates the natural progression from information to action. Google started with the information overload problem. Relevance, speed and access to the long tail of information won the day.
Today’s voice platforms (Google Assistant, Amazon Alexa and others) still start with information. They are evolving toward actions, but provide very low utility today. Further, the Skills Model that Alexa has introduced is, in my opinion, not the correct approach for solving the problem. It places the onus on the user to learn and remember specific command syntax. These two gaps, utility and usability, were the main focus areas for what we wanted to solve with Aiqudo.
Users should be able to use simple, natural voice commands to get things done, all the way. As we say, “Voice to Action!”
What is the state of Voice Assistants in digital transformation journeys?
Today, we’re at Voice 1.0. The tech is interesting, but still super-frustrating to use. With voice, I still cannot do the most basic things that I can already accomplish on my mobile device when I am on the go: make payments, locate my boarding pass at the airline security line, check my friends’ birthdays, check in for a haircut, or call using Messenger. Why not? I can demonstrate many commands that don’t work on Alexa and Google Assistant, not because they don’t understand the user’s intent, but because they don’t have the actions built into the platform to satisfy the user’s intent. This is what we refer to as “Actionable Intent”.
With the next wave of voice innovation, this is bound to change. At Aiqudo, we’ve addressed this issue of “Actionable Intent” with proprietary technology that encompasses the rich app ecosystem without requiring APIs, and without asking app developers to do a lot of work and reinvent the wheel.
In the coming years, you’ll see a deeper AI-based phase where voice systems will be everywhere and seem more sentient — become more accurate, natural, conversational, personal and proactive.
How do you see the raging trend of including ‘AI in everything’ impacting modern businesses?
AI/Machine Learning has always been around, but now there’s a lot of buzz around it. In my opinion, the marketing hype wave will wear off a bit, but the world will indeed be impacted by the advances in Deep Learning and the more pervasive application of technology to automate, recommend, and reduce human error. There will also be challenges with respect to employment and training to serve the emerging demand for these skills.
What are the biggest challenges and opportunities for Voice technology companies in dealing with inflating technology prices?
Scale and competition will reduce prices over time. We joke about this at Aiqudo, but what’s true is that you don’t really need to buy an Amazon Echo or a Google Home device to get things done with voice. The ubiquitous device — the smartphone — is already in your pocket or handbag and already has all your trusted apps. It costs an additional 0 dollars! In terms of other industries, automation and reduced need for human involvement in certain tasks will lower costs.
How should young technology professionals train themselves to work better with AI and virtual assistants?
To work in the industry, it will definitely help to build up skills around scalability, cloud infrastructure, mobile technology, search, data science, and, of course, Machine Learning.
To use voice technology (when it is done right), users need to learn nothing. Voice will make things more accessible to billions of people. People who are disabled, illiterate, or in remote locations will be able to get things done with AI and voice that they cannot do today. Voice is just simpler, faster and more natural.
Tell us about the key takeaways from your recent participation at CES 2019. Which other events are you keenly following?
Voice is here, there and everywhere! You saw Las Vegas painted over! Voice in the car is a major focus. We had exciting demonstrations of our software in the automotive use case, where voice is safer, easier, and hands-free. IoT will become pervasive as the current walled gardens come down.
Mobile and automobile conferences will become more relevant in this space, as people realize the need for voice platforms to encompass the mobile app ecosystem, which is our focus at Aiqudo. Data privacy will continue to garner focus, à la the Apple banner at CES. We agree with “What happens on your phone, stays on your phone.”
How do you consume information on AI/ML and related topics to build your opinion?
From articles, discussions and events. I’d like to review more formal material, but haven’t found the time since the inception of Aiqudo. Luckily, I’m able to learn from my team — we have stellar AI/ML scientists who are doing amazing things and writing about our techniques on our blog. I try to couple my learnings with my own observations and analysis of where things are headed.
What makes understanding AI so hard when it comes to actually deploying it?
That’s a great question. Aiqudo is one of the few AI startup platforms that is already widely deployed. Our Actions platform is live, powering Motorola’s Moto Voice product, in 12 markets and 7 languages. We went live within a year of our existence — a huge challenge and a major accomplishment for the company. I’m so proud of our team. No company I know of has launched their first product/platform globally in this fashion — it is almost always incremental, starting in one market/locale.
We went from English-only to launching in 7 languages in a span of just 4 months! Our unique command matching technology, which is based on a statistical approach using semiotics, is what made this possible. We are now a global operation, and our user data feeds into our unique voice graph, allowing our algorithms to be constantly refined.
What is the biggest challenge to digital transformation in 2019?
Digital transformation will accelerate in 2019 and beyond! One challenge is that users are more sensitive about their privacy, and have valid concerns about how their personal data is going to be used by the largest platforms. Our Q Actions system is hyper-personalized, but has been designed with privacy in mind. We do not store personally identifiable information (PII) in the cloud — but privately on the user’s device. This still provides context and personalization without compromising privacy and trust.
How does an end-to-end solution that captures online behavior data help a company better compete with the likes of Amazon or Google on voice technologies?
We believe we complement other platforms — we do things that these platforms don’t offer. They will find it hard to scale given the developer-centric and cloud-centric approaches they have focused on. We are an open system. Users are not best served by the walled gardens these platforms are creating — no Google services on Amazon, no Amazon shopping on Google, no Facebook on either, etc.
I am a strong believer in personalization, but our approach to privacy is different. We are instantly personalized on day one without setup and configuration, but without requiring sensitive personal data, e.g. contacts data, to be sent to the cloud.
Which is harder: choosing AI or working with it?
We are using our own proprietary AI technology instead of generic tools, since our system works better for this specific application. It offers us higher quality for command matching, and our language-agnostic approach allows us to rapidly support multiple languages and hybrid language commands, and is also more robust with respect to errors in ASR (speech to text). It also gives us more flexibility, which, in turn, allows us to apply our own deep knowledge to complete the execution of actions. Existing AI tools are not generic across all domains and do not have the features and characteristics that we needed for our solution.
How potent is the Human-Machine intelligence for businesses and society? Who owns Machine Learning results?
It is potent, and the balance will be found over time. AI will be disruptive, but that’s true of all technology. See what the Internet did with the democratization of information. See what mobile technology did with the equalization of communications. There will indeed be shifts in employment, automation, careers and education. I’m not nervous that AI will destroy us. It will lift us all up, with some growing pains along the way.
Where do you see AI/Machine Learning and other smart technologies heading beyond 2020?
AI/ML will permeate the consumer world quickly, as we are seeing, and also all aspects of decision-making in various fields. Robotics applications will accelerate. Semi-autonomous driving is already here. Fully autonomous driving is a major paradigm shift, but in my opinion, it will take more time due to practical constraints and legal considerations. I think we’ll see some amazing medical applications (especially in diagnostics) in the coming years.
The Good, Bad and Ugly about AI that you have heard or predict —
I’m a technologist and an optimist – I believe in good!
What is your opinion on “Weaponization of AI”? How do you deal with the challenge here?
I’m not into doomsday scenarios, and weaponization is not specific to AI. New missiles prompt new missile defense systems. Hacker attacks will also drive sophisticated ML-based security systems and algorithms. Where there’s venom, there’s also anti-venom. Like nuclear power, AI can be used for good or bad.
The Crystal Gaze
What AI start-ups and Voice Tech labs are you keenly following?
We track developments coming from the large companies so we can take advantage of some of their tools and technologies; we’d rather not focus our attention and resources on things such as Automatic Speech Recognition (ASR), text-to-speech and translation. These technologies require massive amounts of training data that only the larger platforms have resources for. Smaller independent companies will have ongoing challenges in these areas.
What technologies within AI and Voice are you interested in?
We are always looking for new techniques that can help users, such as biometrics for authentication, voice ID, wake word detection, etc. We don’t need to become experts on these technologies, as there are existing solutions and partners that we can leverage.
As a tech leader, which industries do you think would be the fastest in adopting AI/ML with smooth efficiency? What are the emerging markets for AI technology?
Health, medicine, finance, and manufacturing will benefit from AI and ML. I think education and HR will also see interesting applications.
What’s your smartest work-related shortcut or productivity hack?
In the Bay Area, we’re often stuck in traffic. I can just say “join my meeting with the camera off” to join Hangout meetings that I am running late for, or “call Runa” to call my mom on Facebook Messenger — using voice for a seamless, safe and hands-free way to instantly get things done in the car.
Tag the one person in the industry whose answers to these questions you would love to read:
Thank you, Rajat! That was fun, and we hope to see you back on AiThority soon.
Rajat Mukherjee is Co-Founder and CTO at Aiqudo, which uses its proprietary AI and mobile technologies to connect voice computing to the mobile app ecosystem. His specialties include Search, Internet, E-commerce, Marketing, Parallel Computing and Scalability.