For the first time in the history of software, businesses can build applications that are truly intelligent.
And the best part?
Tapping into this intelligence is no different from accessing APIs, thanks to several AI platforms leading the change. If you’re a business, you can’t ignore the possibilities these services offer.
During 2010-20, we officially entered the era of Artificial Intelligence (AI). Most folks are looking around and saying, “Huh, did I miss something?” because this looks nothing close to the hype the AI “evangelists” and popular culture have been spreading. There are no giant metal squid armies to annihilate and no mind games to be played with human-like intelligence.
However, as technology always does, AI has quietly crept into pretty much everything around us. And while this magical-looking tech is arguably unintelligent by itself–at least in a human sense–its applications, when done correctly, do produce almost magical results. AI may not have anything life-altering for the average person (just everyday improvements, big or small); for businesses, it has unlimited potential for creativity, speed, growth, and much more.
Technologies such as facial recognition, object recognition, speech recognition, etc., promise vastly superior customer experience and an almost unfair edge over the competition. No wonder every business is either setting up an AI division or thinking along these lines. Thankfully, there’s no need to hire an army of PhDs; Artificial Intelligence (or Machine Learning, to be more exact) has matured to the point that several major–as well as niche–players are offering AI services as good ol’ REST APIs.
For a few dollars in total, and in some cases for free, you can evaluate these services and see if they provide a solid value proposition for your business lifecycles.
Yes, once you’ve finalized a service and start using it in production, the monthly bill is what you should keep a strict eye on. AI/ML is essentially about churning massive amounts of data over a distributed set of very powerful servers (mostly use processor chips found in graphics cards); naturally, working with large data and large compute will cost more.
Please, don’t get me wrong.
My intention is neither to diss these companies nor to discourage anyone from dipping their toes in the AI waters. These companies have their pricing plans listed in detail (or provided on request); beyond this, they can’t do much. So, when usage goes unmonitored (or worse, an automated system has a bug that causes, say, server RAM to fill up, and it keeps getting replicated in every new machine created to handle this “load” ), the onus–much like for your electricity bill–falls on you.
Oh, that’s rather simple. The large majority of service providers have an alert system where you can specify an amount that works as the alert threshold (say, $500/month). So, as soon as your usage for a month exceeds $500, the system will send emails, text messages, and what not to every person mentioned in the notify section. Such panic will be hard to miss. 😄 So, what’s the lesson for the day? This financial alert system is one of the first things you should find out about; it should also be one of the first, maybe even the first thing you set up and test. Trust me; you’ll thank me for the advice one day.
All right, enough, chatter! On with the list of AI platforms that I found impressive and what they have to offer.
When it comes to AI, Google is the first name that naturally comes to mind.
And the second name?
Well, at least I can’t think of any! 😂 Google dominates the mindshare when it comes to AI conversations, and for a good reason. Over the years, the company has poured perhaps billions of dollars into AI research and talent. Several of its ambitious AI projects are well-known, and a peek into its latest works sends shivers down the spine:
Because of this deep expertise, Google has some of the highest quality APIs to offer in AI/ML. Let’s look at some of their main offerings.
Text Analysis (Natural Language Processing)
Some of the biggest leaps in AI have been understanding and working with natural languages, whether written or spoken. The Text Analysis API by Google is incredibly powerful, offering features such as:
Syntax analysis (analyze a given text and identify key parts)
Entity analysis (find invoices data in unstructured documents, for example)
Sentiment analysis (identify mood, intention, etc., from written or spoken word)
Multilingual (works with many languages)
So, if you’re itching to find out your customer sentiment from their support chats, try it out right away!
Google has a dedicated prediction service for it if you have your own models and want to generate predictions on new data. It’s even possible to add custom code in case you’re going for something non-standard or experimental. The Prediction service is part of a comprehensive offering called the AI Platform, which we will discuss next.
Those who work with data and AI know just how cumbersome and time-consuming each step of the process can be. To solve these pains, Google offers an end-to-end, comprehensive platform called the AI Platform. It’s a fully managed service for data science and ML and aims to make the operational side of ML and data wrangling as smooth as possible.
So, if you have a non-trivial ML setup and are tired of the hiccups and waits, Google’s AI Platform might be worth a look.
It’d be asking for too much to describe every Google AI/ML service, so those interested can head over to the official docs. There’s a lot more serious, unexplored, jaw-dropping stuff there!
If you take even a little interest in the AI space, you’d have noted the emergence of GPT-3. It’s an advanced ML model for working with natural languages and had (has?) everybody frightened that doomsday was finally here. The force behind GPT-3 is OpenAI, an organization set up to nurture research and collaboration in the AI space — all in the open, which is a rarity in today’s world.
The company was mostly popularized by Elon Musk, one of the founders, when it received massive media attention for its AI research. One example was the game-playing AI that played with and destroyed professional DOTA 2 players at the highest level:
As of writing, Elon Musk is no more involved, and OpenAI isn’t exactly “open” as per its founding principles. But that’s a different discussion, and you can find plenty of material on it.
For us, the bottom line is that OpenAI is doing some seriously groundbreaking work in AI, especially when it comes to text processing, video/image processing, etc. They offer several AI services as APIs, and I’m sure it’s easy to see a strong use case for each of these:
Semantic search: Allows searching on free-form text data, such as documents, based on a query provided in a natural language. So, if you have a digitized library of all customer support chats, you can ask things like “show me a list of chats where customers were very angry because of late resolution”. This wasn’t an official example, but I wanted to make clear the possibilities! 😁
Chatbots: Most chatbots today are nothing but huge baskets full of regret. The business that decided to deploy them regrets later, the developer who created the bot regrets this useless creation, the customer browsing the site regrets interacting with the bot . . . you get the idea. By contrast, OpenAI’s chat capabilities are far superior, especially when it comes to small talk, unexpected turns of conversation, indirect intention, etc. Sure, it’s not perfect, but it raises the bar high enough to make chatbots go from obnoxious/dumb to amusing.
Customer service: If you were worried you’d have to combine the above two services somehow to create a viable customer service experience, OpenAI has already done that. There’s a dedicated service for customer service that has capabilities of search, recommendations, etc.
Text generation: Pretty much like the GPT-3 technology we discussed a while ago, OpenAI offers text-generation capabilities via API. The result is real, intelligent text about pretty much anything (even abstract and weird stuff) that you can use in various creative ways!
Comprehension: This service takes a given text and produces a summary of it. Yes, in its own words! The amount of time this can save and the scope for something this useful is immense. Email fatigue is a good use case in my opinion: just have the AI summarize messages so you can clear your inbox in 10 minutes instead of three hours!
Other tools: OpenAI also has a few other tools/services that come in handy during real-world usage. For instance, it’s possible to convert semantic search results into a spreadsheet for easy analysis; then there’s a service to translate text from one language to another (a pretty common need); and so on.
While OpenAI made a giant splash in the AI world recently, access to its APIs is not easy. You have to apply to join a waitlist; who gets approved, when, and how — these also remain a mystery. Finally, don’t forget that while these technologies are extremely powerful, they are not fully mature. Hence the label “beta” on their entire gamut of services. Still, I’d say it’s worth applying and trying it out in a pilot project.
When it comes to cloud offerings, Microsoft is said to be a distant third (after AWS and Google, that is). But that doesn’t mean the business is in trouble; it has its own special strategy (migrating existing Windows businesses) and is running its own race. While the name Azure is well known, what’s not is that Azure also has a robust set of offerings when it comes to AI-related services. Say hello to Azure Cognitive Services!
And in case you thought Microsoft has been doing nada in the AI space, watch this:
Azure Cognitive Services is a full-fledged AI offering that has pretty much everything you need to build intelligent, powerful applications. In fact, most of their APIs have interesting and more specialized use-cases, which, in my opinion, gives them an edge. Here’s a quick summary of what major APIs they have and what their capabilities are:
Language: These APIs are built around what’s called Natural Language Processing in computer science. In simpler terms, it’s about extracting meaning from, generating, and working with human languages (whether spoken or written). Some interesting capabilities are conversational QnA maker (imagine the possibilities in training/education/hiring!), infusing conversational intelligence into IoT and other devices, sentiment analysis and other metadata about a given text, translation (60+ languages as of writing), and more.
Speech: These APIs provide apps the capabilities of working with human speech. Key offerings include speech to text conversion, text to speech conversion, speech translation, and speech recognition.
Vision: Computer Vision has been a hot topic, and though far from perfect, it is capable enough in scenarios where some margin for error exists. The Vision APIs offered include capabilities such as image and video analysis, object recognition (in image and videos), face detection, video indexer (generating metadata from video), and more.
Decision: This is a set of general-purpose APIs that help in either better decision making or improving the process you follow for ML-based decisions. Capabilities offered in this gamut are anomaly detection (extremely useful for data scientists), content moderation, personalization service (helps you create intelligent, personalized interactions for your app users), and more.
The Microsoft of today is very different, with a clear vision and focus on the cloud, services, and integrated solutions. If you’re running a Windows-based operation, whether on-premise or cloud, integrating the Azure cognitive APIs into your products makes even more sense.
AWS AI Services
When talking about cloud-based services and infrastructure, it’s impossible not to mention Amazon Web Services (AWS). I couldn’t find a highly credible source, so I can’t link it, but apparently, AWS alone has about 33% of the cloud market share. And as a developer, I can vouch for the powerful pull the platform has for all sorts and sizes of software architects, CTOs, developers, business owners, etc.
If it’s a new SaaS product, people want to host on AWS right from the start; and if someone is having scaling or stability issues, they want to move it to AWS.
I’m not saying AWS is the absolute best choice for cloud infrastructure, but its range of services and low-price strategy is hard to beat. The point being, if baking AI/ML capabilities into your (new or existing) apps is on your list, you can never go wrong with AWS’s AI Services.
Here’s their elevator pitch:
AWS offers several powerful, feature-rich services when it comes to AI/ML. Let’s have a quick look at them:
Polly: Text-to-speech is a much-needed capability these days, especially because it allows businesses to create truly “alive”, intelligent apps that can also converse in a human-like, believable voice. Amazon Polly does just that. While the output isn’t exactly the stuff of dreams (listen to the official samples here and here), it’s pretty good for most use cases.
Transcribe: This service is the reverse of Polly, turning speech into text. I can personally testify to its effectiveness, as I used Transcribe in one of the projects to read call center recordings and produce a transcription. The output was extremely accurate (again, I don’t have stats, but I’d say it had above 95% accuracy), and it was able to effortlessly pick up different accents even with some background noise. Plus, the amount of metadata it generated was staggering.
Rekognition: Rekognition is Amazon’s service for computer vision (for images and videos). Besides the standard stuff like facial recognition, object detection, labeling, etc., it also has interesting capabilities such as content moderation (controlling what your kids are watching on their devices, for example), celebrity recognition, equipment recognition (for worker safety and compliance), and more.
Fraud Detector: Fraud is a tar pit costing businesses much money and effort every day. This service provides help by offering fraud detection capabilities on new account creation, guest checkout, online payment, abuse of loyalty programs, etc. Clearly, this service would be very useful to the e-commerce ecosystem.
Lex: If chatbots are your love, but you’re tired of the boring, dumb chatbots commonly found everywhere, Lex is the thing to explore. It has all capabilities a modern chatbot needs, and since it’s a managed service, you don’t have to worry about running servers.
Kendra: Kendra is a document search service, except that search queries are in human language. The service apparently comes with deep “expertise” in a few industries, which means if your data happens to be from one of these industries, the search can be fine-tuned for more accuracy.
A few more services AWS has listed, but if I try to cover them all, I’ll run out of paper and ink! 😁 Besides, if I know one thing about AWS, it’s that it follows Hubble’s Law, resulting in an ever-expanding universe. By the time you’re reading this article, their number of AI services might have doubled or even become ten times! So, if you’re interested, I encourage you to visit the official page and spend some time exploring the services, capabilities, cost, etc.
Since AWS has the highest market share, chances are you’re already hosted on AWS. Or maybe you’re considering shifting your infrastructure to AWS? If so, choosing AWS AI Services will allow your apps to work with other AWS services (think S3, EC2, SNS, etc.) seamlessly and reliably. Just talk to folks who’d had to maintain apps split across infrastructures and you’ll find yourself convinced for life. 😝
ParallelDots is admittedly nowhere close in popularity to the companies in this list so far. However, they’re a rare find and I think they deserve more visibility.
Being primarily an AI company, they create highly useful tools and industry-specific solutions. But perhaps most importantly, they seem to believe in quality over quantity; in their products menu, there are only four items (at least as of now), and one of them stood out for me because it was generic and highly accurate. And the service we’re talking about is their text analysis APIs.
If you visit the link above and scroll down a little, you’ll find a live playground of sorts, where you can enter any text and see the AI’s analysis capabilities at the click of a button.
The text you see in the screenshot is the default they’ve set, by the way. Once you hit the green Analyze button, the analysis of the text as per various categories appears below (the categories are the buttons).
So, how good is the API? I thought of doing some testing of my own, so I fed it something not so straightforward — a piece of prose taken from one of the modern literature classics (for those curious, the book is On the Road by Jack Kerouac, written in 1957). Let’s have a read at the text ourselves first:
The only people for me are the mad ones, the ones who are mad to live, mad to talk, mad to be saved, desirous of everything at the same time, the ones who never yawn or say a commonplace thing, but burn, burn, burn like fabulous yellow roman candles exploding like spiders across the stars.
What do you think about it? What is it trying to convey? What mood do you think it reflects? It’d be good to pause and give a thought to these questions.
And then I pasted it in the text box and hit Analyze. Here’s what turned up:
All in all, pretty good! The piece of prose I selected is pretty challenging and not indicating anything explicitly. However, the sophisticated reader will detect a clear shade of angst/anger that stands out. And that’s also what the API shows as the dominant emotion! However, the text isn’t just plain angry, which is reflected in the API’s confidence score of 30.58%. The near-20% score assigned to “boredom” and “happiness” makes sense as well, as I think these emotions are reflected in the text, though not as dominant ones. Fear, sadness, excitement . . . well, who am I to say these are absent from the text?! Thing is, prose composition and comprehension are highly subjective, so if you disagree with me, it’s fine. 🙂
However, personally, I came out equally impressed with the ParallelDots service as I explored other parts of the above analysis. Sure, it wasn’t right on target all the time, and in a few cases, it was weird too; but as I wrote earlier in this article, 100% accuracy isn’t the goal (and perhaps not even achievable). The goal is a powerful AI that helps us build the kind of applications we’ve only been able to dream about for decades.
So, is the ParallelDots text analysis service for you?
I’d say yes if your needs are limited to text analysis, you want extremely high accuracy, and you’re not fond of the lack of attention you get as a customer when choosing from the biggest names in the game.
Not long ago, IBM’s Watson project was the all-powerful AI that would replace humans once and forever. It was creating movie trailers, beating the best players in Jeopardy, and so on. The end is near; everyone was convinced in their heart of hearts. Fast forward to 2020, and Watson is nowhere in public memory.
But that doesn’t mean it was a flash-in-the-pan project that was later binned. While the AI fell short of its epic potential (or perhaps it was a PR strategy all along?!), Watson lives on as the brain in IBM’s AI offerings for enterprises.
Watson Assistant: This service contains many components geared at improving the customer service experience — both for the customer and the agent! Helping agents find info quickly to resolve queries, understand customer queries and personalize their journey, provide detailed data and metrics, and extract insights from that data — Watson Assistant does it all.
RegTech: IBM RegTech is a heavyweight service aimed at improving compliance and integrating risk management into all layers of an organization’s operations. At a finer level, it also targets key concerns such as payment fraud, financial crime, etc.
Watson Health: Watson Health is a highly specialized AI service for the healthcare industry. Assisting with data-related needs in research, diagnostics imaging, optimizing healthcare plans for cost and quality, etc., are some of its capabilities.
AIOps: AI + Ops = AIOps, says IBM. It’s is a specialized AI service for optimizing IT operations. The IT toolchain and IT operations can get so large and complex that no solution seems workable at an enterprise level. In these scenarios, AIOps helps with early problem detection, solution resilience, improved decision making, and more.
Watson Media: The Watson Media service is specialized for live video streaming at scale. The AI part makes it capable of caption generation, video search, video analytics, etc., on the fly. Since security camera feeds are also a form of live streaming, Watson Media is a good fit there too for threat detection, object recognition, etc.
There are a couple more AI services by IBM, and you can learn about all of them here. IBM is a solid choice for AI services, but remember that their positioning and offerings are optimized towards large to very-large enterprises, so make sure it’s a mutual fit.
Rev.ai is another of those AI companies that believe in developing expertise and doing a few things well. Except that they’ve decided to do only one thing well. Yes, just one! Speech-to-text conversion. Yup, that’s literally all they offer! There’s not even text-to-speech, let alone other categories of AI/ML.
And the result of this hyper, bordering-on-madness obsession? Extreme accuracy, arguably the best among the best in the world. And they offer proof of their AI on this page.
As you can see, their tests show Rev.ai being much more accurate than Google’s speech-to-text. There are many similar comparisons on that page (all compared to and shown beating Google), though sadly there’s no live playground (I wonder why; does it use a lot of computing power? Some other reason?). But that doesn’t mean you can’t evaluate the service; you can create a free account and scrutinize the API as closely as you wish. 🙂
Rev.ai may launch more services in the future, and I scramble to “fix” this article. However, that’s not the case today, so if you want a speech-to-text service with no compromise on accuracy, Rev.ai deserves your attention.
Wit.ai is an AI platform that has advanced capabilities in speech processing as well as text processing. Yes, that sounds like every other NLP and text-analysis/transcription service out there, but there’s more:
Wit.ai is open source. So, there’s nothing stopping you from learning from their tech or hosting the platform on your infrastructure.
Wit.ai isn’t just some code dump lying on GitHub — it’s an actual, running API service as well (in the form of HTTP APIs), which is open for anyone to use.
The API service is free. Yes, totally free! In fact, it’s so free that no pricing plans exist. 🤣🤣
Wit.ai is meant to be extensible. That is, its core purpose is more or less to help you (push you?) into creating, training, testing, and using ML models.
The last point in the list above (about extensibility) needs some unpacking, so here goes: Wit.ai is meant to sit between the user and the device that takes commands and performs actions. The user talks or texts to Wit.ai, which can analyze the message and generate metadata. Once it has figured out what the user wants to do (look for “intent” in the screenshot above) and how they want to do it (the other details in the screenshot: task and datetime), it sends relevant commands and info to the device.
I must emphasize: off the shelf, Wit.ai has very few capabilities. The whole idea is to push you into creating your own ML models, a process that is generally frustrating but is made fun and easy by Wit.ai. And that’s where its strength lies. And oh, in case you decide to use the free API, bear in mind that rate limits exist (roughly 100-250 requests per minute, depending on the endpoint).
Artificial Intelligence (AI), Machine Learning (ML), Neural Networks, data, models, training, prediction . . . none of these are buzzwords anymore. And as happens with any groundbreaking technology, once it has stabilized, AI has been commoditized. The platforms discussed in this article make available the same superpowers to everyone, whether you are a fledgling startup or an industry-munching behemoth.
This forces me to caution stakeholders — AI/ML (APIs or no API) will not, by itself, magically increase your growth (just as “going social” achieves nothing on its own). While exciting, AI has created a level playing field. The rest is up to us. 🙂
Next, explore some of the best AI frameworks to build modern applications.
Google Docs does a great job of keeping things simple. The default page setup works great for most documents, and common formatting options are right on the toolbar. However, when you need to do some advanced formatting, you’ll need to dig a little deeper.