Demand for computing power is booming as data volumes grow and machine learning algorithms advance. This is driving the creation of an intelligent ecosystem built on cognitive services.
Cognitive services can be defined as a set of APIs and toolkits for developing artificial intelligence applications that process unstructured information and generate business value from it.
The main advantage of a cognitive services API is that it allows people to create powerful intelligent apps without being highly qualified software developers.
Microsoft Cognitive Computing
In 2015, Microsoft released a set of intelligent technologies named Project Oxford. The company developed solutions for performing cognitive tasks such as face detection, speech recognition, image categorization and language understanding.
Later, the product expanded and was rebranded as Microsoft Cognitive Services. Today, it is a collection of APIs (application programming interfaces) and SDKs (software development kits) that are accessible from any kind of app. These intelligent tools work across programming platforms and languages, and help incorporate AI functionality into various applications with minimal coding.
Today, Microsoft is one of the largest providers of AI-based cloud capabilities, and its package keeps growing. Cognitive technologies are based on the Microsoft cloud platform Azure, so a developer must create an Azure account to test them. Free trials are available.
Azure Cognitive Services
Let’s take a closer look at the Azure cognitive services to get an understanding of what they do. The technologies are grouped into six categories, mostly by the type of data they analyze: Vision, Speech, Language, Knowledge, Search and Labs.
Vision APIs analyze visual content (images and video) identifying objects, recognizing faces and emotions. Vision algorithms allow the implementation of face authentication into apps, and the creation of services that can group faces according to some characteristics or guess the age of a person in a photo.
Vision APIs include:
- Computer Vision API (tags and categorizes images from any content, recognizes human faces and other objects, generates descriptions, identifies printed text)
- Content Moderator (monitors text, images and video for offensive or unwanted material and removes content that is disruptive for your business)
- Custom Vision Service (customizes computer vision models for your personal use cases)
- Face API (detects human faces in an image with attribute features such as age, emotion, gender, pose, smile and facial hair; can check if two faces belong to the same person, or group faces on the basis of their similarity)
- Emotion API (identifies universal emotions in images and reveals people’s reactions and mood in different situations)
- Video Indexer (the service is suitable for all kinds of video content; it allows tracking and identifying faces, extracting audio transcriptions, detecting scenes, interpreting the visual text, analyzing sentiment, searching images and events in a video on the basis of tags)
Speech APIs implement speech processing in apps: they convert speech to text and vice versa, translate text to other languages, and identify speakers. The technology can be used for hands-free tools to dictate text or to read instructions out loud, for instance.
Speech APIs include:
- Translator Speech API (this Azure cognitive services API enables transcribing and translating real-time conversations)
- Bing Speech API (converts speech to text and text to speech)
- Speaker Recognition API (identifies the speaker; can be used as means of access control and authentication)
- Custom Speech Service (tailors language and acoustics for your application; helps cope with speech recognition barriers such as dialects and background noise)
Language APIs process natural language: they check spelling and analyze sentiment and syntax.
Language APIs include:
- Language Understanding Intelligent Service (LUIS) (this Microsoft cognitive services API teaches your app to understand what users mean when they say or type something, i.e. learn which words imply which action)
- Bing Spell Check API (corrects spelling errors, understands homophones)
- Web Language Model API (automates a variety of standard natural language processing tasks, e.g. inserts spaces into a string of words lacking spaces)
- Text Analytics API (detects sentiment, key phrases and topics to understand what users want)
- Translator Text API (conducts real-time machine translation with multiple language support)
- Linguistic Analysis API (simplifies complex texts in order to home in on the core concept and understand what users mean)
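To make the Text Analytics item above more concrete, here is a minimal sketch of the JSON body a sentiment request might carry. It assumes the v2.0 REST format (a `documents` array whose items have `id`, `language` and `text` fields). The helper name `build_sentiment_payload` is hypothetical, and the field names should be checked against the current API reference before use.

```python
import json


def build_sentiment_payload(texts, language="en"):
    """Wrap plain strings in the 'documents' envelope (assumed v2.0 format)."""
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    return {"documents": documents}


payload = build_sentiment_payload(
    ["The new release is great.", "Support never answered."]
)
body = json.dumps(payload)  # this string would be sent as the POST body
```

The service is expected to return one score per document id, so keeping the ids stable makes it easy to match results back to the original texts.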
Knowledge APIs analyze data to discover relationships and patterns to complete tasks such as recommendations or query autocompletion.
Knowledge APIs include:
- QnA Maker API (creates your own bot that can answer questions in a natural way)
- Custom Decision Service (understands the context of provided information, ranks the options and makes a decision; is able to optimize and improve based on your feedback and experience)
- Project Knowledge Exploration (interprets natural queries, enables interactive search experiences over structured data, creates interactive autocompletion suggestions)
- Project Academic Knowledge (helps users narrow down search results and find the right academic information on any topic faster)
- Project Entity Linking (links input text to additional information on the web to eliminate the ambiguity of some words)
Search APIs are integrated with the Bing search engine and include:
- Bing Autosuggest API (suggests options for searches based on what other users have asked for)
- Bing News Search API (searches for news on the web according to the query)
- Bing Web Search API (returns search results from billions of web documents)
- Bing Entity Search API (searches for entities and places such as restaurants and local businesses)
- Bing Image Search API (searches for images relevant to the user’s query)
- Bing Video Search API (searches for videos on the web according to the query)
- Bing Custom Search API (creates tailored search experiences to deliver the results you want)
Cognitive Services Labs give developers a chance to use experimental technologies that are still under development. Developers who don’t need a market-ready technology can try these experimental services and provide feedback on them before they become generally available.
Currently, Labs include:
- Project Gesture
- Project Event Tracking
- Project Local Insights
- Project Knowledge Exploration
- Project Academic Knowledge
- Project Entity Linking
Microsoft Advances Some of Its Hosted Artificial Intelligence Algorithms and Updates Terms
In February 2018, the Microsoft team announced enhancements in some of their hosted AI tools. The most significant changes are:
- The Custom Vision Service has moved from free to paid preview for new users
- The Face API provides an opportunity to group up to a million images
- The Bing Entity Search algorithm enables developers to embed Bing search results in any application
The team also updated cognitive services terms for customer data to promote transparency and give customers more control over privacy. This means that:
- Cognitive services are now under the same terms as other Azure products. According to the terms, customers own their data, and can manage and delete any information. The changed terms concern Computer Vision, Face, Content Moderator, Text Analytics, Bing Speech and Language Understanding services. Microsoft Translator will be updated in May. All new cognitive products will conform to the same standards as Azure services.
- Bing Search Services are the exception. Their data will continue to be treated differently from other customer data.
How to Use Microsoft Cognitive Services
For some services, free trials are available. To try them out, do the following:
- Sign in with your Microsoft account if you have one, or create a new Azure account
- Go to the Cognitive Services web page
- Select the necessary API
- Get the API key
- Start using the API
- When your trial period ends, you will be offered the option to purchase the service. The list of prices is given on the corresponding purchase page.
- To purchase, you will need a credit card and a valid mobile number
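Once you have a key, the "start using the API" step usually comes down to an HTTPS request that carries the key in a header. The sketch below prepares, but does not send, a Computer Vision request using only the Python standard library. The endpoint pattern and the `westus` region are assumptions for illustration; the actual endpoint is shown next to your key in the Azure portal. The function name `build_analyze_request` is hypothetical.

```python
import json
import urllib.request


def build_analyze_request(api_key, region, image_url):
    """Prepare (but do not send) a Computer Vision 'analyze' request."""
    # Assumed endpoint pattern; copy the real one from the Azure portal.
    endpoint = (
        f"https://{region}.api.cognitive.microsoft.com"
        "/vision/v1.0/analyze?visualFeatures=Description,Tags"
    )
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        # Cognitive Services REST calls pass the subscription key in this header.
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


request = build_analyze_request(
    "YOUR_API_KEY", "westus", "https://example.com/photo.jpg"
)
# To actually call the service: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before spending any trial quota.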
Developers who are familiar with building cloud apps will face a minimal learning curve. Those who lack cloud expertise can integrate cognitive services using Azure Logic Apps and minimal coding. The main challenges concern:
- The connection of cloud-based apps with internal data sources (you’ll need to implement a hybrid network and an integration strategy before you start producing AI apps)
- Availability (Microsoft cognitive services are available only in a subset of Azure regions)
- Pricing (it can be confusing as there are several pricing models: billing per the number of service calls per month; billing per hour; fixed price with charges for exceeding the quota)
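One way to cut through the pricing confusion is to estimate the monthly bill under each model for your expected workload. The sketch below does that for the three models named above. Every rate and quota in it is a made-up placeholder, not a Microsoft price, and the function names are hypothetical; substitute the figures from the relevant pricing page.

```python
def per_call_cost(calls, price_per_1000):
    """Pay per service call, priced per 1,000 calls (pro-rated here)."""
    return calls / 1000 * price_per_1000


def hourly_cost(hours, price_per_hour):
    """Pay for every hour the service is provisioned."""
    return hours * price_per_hour


def fixed_with_overage_cost(calls, base_price, included_calls, overage_per_1000):
    """Flat fee covering a quota, plus a charge for calls beyond it."""
    extra_calls = max(0, calls - included_calls)
    return base_price + extra_calls / 1000 * overage_per_1000


# All numbers below are placeholders chosen for illustration only.
monthly_calls = 250_000
estimates = {
    "per call": per_call_cost(monthly_calls, price_per_1000=1.50),
    "per hour": hourly_cost(hours=720, price_per_hour=0.50),
    "fixed + overage": fixed_with_overage_cost(
        monthly_calls, base_price=200.0, included_calls=200_000, overage_per_1000=1.0
    ),
}
cheapest = min(estimates, key=estimates.get)
```

Re-running the comparison with a few different values of `monthly_calls` shows where one model overtakes another for a given workload.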
Creating a Microsoft Cognitive Services app has become quite an easy task, making AI more accessible to enterprises. When provided as a service, artificial intelligence eliminates the need for organizations to build the costly infrastructure required for machine learning algorithms. Developers can implement AI features using the same code for Windows, iOS, Android and browser clients.
Many companies worldwide are using cognitive technologies. Uber, for instance, provides platform security with the help of the Face API. It verifies that the driver is the same person listed on the account.
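As a rough illustration of this kind of identity check (not Uber's actual implementation), the sketch below builds the body for a Face API verify call and interprets the answer. It assumes the v1.0 field names `faceId1`, `faceId2`, `isIdentical` and `confidence`. The 0.7 threshold, the helper names and the sample response are hypothetical.

```python
import json


def build_verify_body(face_id_1, face_id_2):
    """JSON body for a Face API verify call (assumed v1.0 field names)."""
    return json.dumps({"faceId1": face_id_1, "faceId2": face_id_2})


def is_same_person(verify_response, min_confidence=0.7):
    """Accept the match only if the service agrees and is confident enough."""
    identical = bool(verify_response.get("isIdentical"))
    confident = verify_response.get("confidence", 0.0) >= min_confidence
    return identical and confident


# Illustrative response, not real service output.
sample_response = {"isIdentical": True, "confidence": 0.82}
driver_verified = is_same_person(sample_response)
```

The face ids would come from an earlier detection call on the stored account photo and on a fresh selfie. Adding an application-level confidence threshold on top of the service's own verdict is a design choice that lets the app be stricter for sensitive actions.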
SaM Solutions has deep expertise in working with the Microsoft Azure platform. Our developers have experience implementing AI elements as part of various projects. Moreover, our team took part in the creation of Bing Speech API, Language Understanding Intelligent Service (LUIS), Computer Vision API and Video Indexer.
Artificial intelligence is the future. Contact us and start creating your future right now!