Machine Learning for Any Size of Data, Any Type of Data

See, Hear and Understand the world

Unstructured data accounts for 90% of enterprise data*
*Source: IDC

Your data spans Text, Speech and Images
> 2 Billion Images daily
> 2 Million Blog Posts daily
> 400 Million Social Media Posts
20% of Mobile Searches

What can we do with all this data?
Moderate Content
Understand Sentiment
Structured Metadata

Examples of understanding unstructured data

Pre-Trained Machine Learning Models
Cloud Vision
Cloud Translate
Cloud Speech
Fully trained ML models from Google Cloud that let a general developer take advantage of rich machine learning capabilities through simple REST-based services.
Stay tuned...

Ready-to-use machine learning models: Cloud Vision API, Cloud Speech API, Cloud Translate API
Use your own data to train models: Cloud Machine Learning (Develop - Model - Test)
Related services: Google BigQuery, Cloud Storage, Cloud Datalab
Stay tuned...

Cloud Vision API
Insight from images with our powerful Cloud Vision API

Cloud Vision API
Call the API from anywhere, with support for embedded images and Google Cloud Storage
● Faces: faces, facial landmarks, emotions
● OCR: read and extract text, with support for > 10 languages
● Label: detect entities from furniture to transportation
● Logos: identify product logos
● Landmarks & Image Properties: detect landmarks and the dominant color of the image
● Safe Search: detect explicit content (adult, violent, medical and spoof)

API Usage: Detect Objects in an Image
Image -> Vision API -> Detected Items
1. Create a JSON request with the image or a pointer to an image
2. Call the REST API
3. Process the JSON response
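The request and response steps above can be sketched without making a network call. This is a minimal illustration, not the deck's own code: the bucket name, the helper names and the sample response are hypothetical, and the request shape (an image `source` with a `gcsImageUri` pointer) is assumed from the public Vision API v1 reference.

```python
import json


def build_label_request(gcs_uri, max_results=5):
    """Step 1: build the JSON body for a label-detection request.

    Uses a pointer to an image in Cloud Storage instead of inline bytes.
    """
    return {
        "requests": [{
            "image": {"source": {"gcsImageUri": gcs_uri}},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }


def extract_labels(response):
    """Step 3: pull (description, score) pairs out of the JSON response."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [(a["description"], a.get("score")) for a in annotations]


# Hypothetical bucket and object names, for illustration only.
body = build_label_request("gs://example-bucket/cat.jpg")
print(json.dumps(body, indent=2))

# Step 2 (the REST call) is omitted here. This is a hand-written sample in
# the shape of an API response, not real output.
sample_response = {
    "responses": [{
        "labelAnnotations": [
            {"description": "cat", "score": 0.98},
            {"description": "pet", "score": 0.95},
        ]
    }]
}
print(extract_labels(sample_response))
```

Keeping request construction and response parsing in separate functions makes both easy to test without credentials.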

Use the Vision API - Python example
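import base64

from googleapiclient import discovery

# Build the client; assumes Application Default Credentials are configured
service = discovery.build('vision', 'v1')

# Read the image (image_path is a placeholder) and base64-encode it
with open(image_path, 'rb') as image_file:
    image_content = base64.b64encode(image_file.read())

# Set up the service request for an embedded image
service_request = service.images().annotate(body={
    'requests': [{
        'image': {
            'content': image_content.decode('UTF-8')
        },
        'features': [{
            'type': 'LABEL_DETECTION',
            'maxResults': 1
        }]
    }]
})

# Process the results: take the description of the top label
response = service_request.execute()
label = response['responses'][0]['labelAnnotations'][0]['description']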

Use Case: Image Content Moderation
Examples
● A user manages a large set of crowd-sourced images.
● Identify potentially explicit content in uploaded images.
Enabling Technology
● Powered by Google SafeSearch: detect inappropriate content, from adult to violent
“As a company, I must detect adult content, violent content, spoof images and medical content to protect my consumers and my brand.”
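A moderation rule on top of Safe Search results might look like the sketch below. It assumes the response carries a `safeSearchAnnotation` object whose `adult`, `spoof`, `medical` and `violence` fields hold likelihood strings, as in the public Vision API v1 reference. The function name, the threshold choice and the sample values are illustrative assumptions.

```python
# Likelihood values, ordered from least to most likely.
LIKELIHOODS = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]


def flagged_categories(safe_search, threshold="LIKELY"):
    """Return the categories whose likelihood is at or above the threshold."""
    limit = LIKELIHOODS.index(threshold)
    return sorted(
        category
        for category, likelihood in safe_search.items()
        if likelihood in LIKELIHOODS and LIKELIHOODS.index(likelihood) >= limit
    )


# Hand-written sample annotation, not real API output.
sample = {
    "adult": "VERY_UNLIKELY",
    "spoof": "POSSIBLE",
    "medical": "UNLIKELY",
    "violence": "LIKELY",
}
print(flagged_categories(sample))                        # ['violence']
print(flagged_categories(sample, threshold="POSSIBLE"))  # ['spoof', 'violence']
```

A stricter threshold such as `POSSIBLE` sends more images to manual review, while `VERY_LIKELY` blocks only the clearest cases.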

Use Case: Image Sentiment Analysis
Enabling technology
● Cloud-based API that provides the most advanced algorithms for face and logo detection.
● Ability to identify the emotional state of a face: joy, sorrow or anger.
● Ability to identify popular product brand logos within the image.
● Ability to draw a bounding polygon around the identified product.
“As a developer, applications I build should be able to detect faces and emotional facial attributes and detect objects and logos.”
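The emotion and bounding-polygon capabilities can be read from a face annotation roughly as follows. This sketch assumes the `joyLikelihood`, `sorrowLikelihood`, `angerLikelihood` and `boundingPoly.vertices` fields of the public Vision API v1 response. The helper names and the sample face are hypothetical.

```python
LIKELIHOODS = [
    "UNKNOWN", "VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY",
]
EMOTION_FIELDS = {
    "joy": "joyLikelihood",
    "sorrow": "sorrowLikelihood",
    "anger": "angerLikelihood",
}


def dominant_emotion(face):
    """Return the most likely emotion, or None if none reaches POSSIBLE."""
    best_name, best_rank = None, LIKELIHOODS.index("POSSIBLE") - 1
    for name, field in EMOTION_FIELDS.items():
        rank = LIKELIHOODS.index(face.get(field, "UNKNOWN"))
        if rank > best_rank:
            best_name, best_rank = name, rank
    return best_name


def bounding_box(face):
    """Return (min_x, min_y, max_x, max_y) from the bounding polygon.

    Missing coordinates are treated as 0.
    """
    vertices = face["boundingPoly"]["vertices"]
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


# Hand-written sample face annotation, not real API output.
face = {
    "joyLikelihood": "VERY_LIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
    "angerLikelihood": "UNLIKELY",
    "boundingPoly": {"vertices": [
        {"x": 10, "y": 20}, {"x": 110, "y": 20},
        {"x": 110, "y": 140}, {"x": 10, "y": 140},
    ]},
}
print(dominant_emotion(face))
print(bounding_box(face))
```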

Use Case: Image Metadata
Enabling technology
● Powered by the same technologies behind Google Photos: detect 1000s of everyday objects, from transportation to home interiors
● Detect 1000s of man-made and natural landmarks
● Ability to draw a bounding polygon around the identified entity.
“As a developer, I want to understand the contents of the image, from everyday entities to logos and landmarks.”
Use Case: Extract Text
Enabling technology
● Read text from any image containing receipts, invoices or scanned documents
● Supports a variety of languages, from English to Chinese
● Granular text extraction, from individual words to a text summary
“As a developer, I want to extract text from receipts, invoices and images.”
Customer testimonials
“We have drones that take thousands of photos per flight. We find that Google Cloud Vision API is the best way to turn those huge number of photos, automatically produced, into meaningful insight.”
Tomoaki Kobayakawa, General Manager, Sony - Aerosense Inc.
“We did the impossible: ML without knowing anything about ML.”
David Zuckerman, Head of Developer Experience, WIX.com

Vision API Demo

Cloud Vision API Summary
• Google Cloud Vision API provides the broadest set of vision scenarios from a single API
1. Label Detection
2. OCR
3. Explicit Content Detection
4. Facial Detection
5. Landmark Detection
6. Logo Detection
• Integrated: Vision API is integrated with other Google Cloud Platform products
• Easy-to-use API: inline image with JSON response
• Pay-as-you-go model: users pay only for what they use, with no upfront commitments

Cloud Speech API
Speech to text conversion

Cloud Speech API
Call the API from anywhere, with support for streaming audio and Google Cloud Storage
● Recognize Speech: streaming recognition
● Transcribe Audio: transcribe stored audio
● Global: supports > 80 languages

API Usage: Understand Speech - Batch
Stored Audio -> Speech API -> Recognized Text
1. Create a JSON request with the audio file and the language of the audio (default is en-US)
2. Call the REST API
3. Process the JSON response
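The request and response steps for stored audio can be sketched as follows, without making a network call. The field names here (`config`, `audio`, `sampleRateHertz`, `languageCode`, `results`, `alternatives`) are assumed from the public v1 REST reference and differ from the earlier field names used in this deck's own Python example. The helper names and sample data are hypothetical.

```python
import base64


def build_recognize_request(audio_bytes, language_code="en-US", sample_rate_hz=16000):
    """Step 1: build the JSON body with base64-encoded audio and its language."""
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(audio_bytes).decode("utf-8")},
    }


def best_transcript(response):
    """Step 3: join the top alternative of each result into one transcript."""
    parts = [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if result.get("alternatives")
    ]
    return " ".join(part.strip() for part in parts)


# Two placeholder bytes stand in for real audio.
body = build_recognize_request(b"\x00\x01")
print(body["audio"]["content"])

# Hand-written sample in the shape of a response, not real API output.
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "hello world", "confidence": 0.9}]},
        {"alternatives": [{"transcript": " how are you"}]},
    ]
}
print(best_transcript(sample_response))
```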

API Usage: Understand Speech - Streaming
Streaming Audio -> Speech API (gRPC) -> Recognized Text
1. gRPC streaming request with initial context
2. Bi-directional: streams audio in while streaming text out
3. Real-time streaming results while speaking
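The "initial context first, then audio" pattern of a streaming request can be sketched with a plain Python generator. This shows only the message ordering. The dictionaries are illustrative stand-ins rather than the real gRPC message types, and the function name and chunk size are assumptions.

```python
def request_stream(config, audio_bytes, chunk_size=3200):
    """Yield one configuration message, then the audio in fixed-size chunks.

    3200 bytes is 100 ms of 16 kHz, 16-bit mono audio.
    """
    yield {"config": config}
    for start in range(0, len(audio_bytes), chunk_size):
        yield {"audio": audio_bytes[start:start + chunk_size]}


# 8000 zero bytes stand in for a quarter second of recorded audio.
messages = list(request_stream(
    {"encoding": "LINEAR16", "sampleRate": 16000},
    bytes(8000),
))
print(len(messages))
print([len(m["audio"]) for m in messages[1:]])
```

In a real client, a generator like this would feed the outgoing side of the bi-directional stream while recognized text is read from the incoming side.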

Use the Speech API - Python Example
import base64
import json

# Set up the service request for an embedded audio file
with open(speech_file, 'rb') as speech:
    speech_content = base64.b64encode(speech.read())

# get_speech_service() is a helper (not shown on the slide) that returns
# an authenticated Speech API client
service = get_speech_service()
service_request = service.speech().recognize(
    body={
        'initialRequest': {
            'encoding': 'LINEAR16',
            'sampleRate': 16000
        },
        'audioRequest': {
            'content': speech_content.decode('UTF-8')
        }
    })

# Execute the request and print the JSON response
response = service_request.execute()
print(json.dumps(response))

Use Cases
● Voice-enabled chat and messaging apps: use voice commands to dictate messages and retrieve information
● Voice-controlled games: players control the game with selected voice commands spoken into a microphone
● Home automation: monitor and control the devices in your home by voice, the web or a variety of other interfaces
● Meeting analytics: identify words, phrases and patterns that correlate with important customer actions, to drive business results
● Call-center analytics: listen to your business's everyday interactions to improve your customer experience
Referenceable Customers

Speech API Demo

Cloud Speech API Summary
● Global footprint: recognizes over 80 languages and variants
● Highest quality voice recognition: neural-network-based, continuously trained and improved
● Fast: streaming recognition returns partial results as soon as they become available, rather than waiting for the user to stop speaking
● Accurate: noisy audio handling transcribes audio from many environments without requiring additional noise cancellation on the developer’s side
● Both real-time and buffered audio: convert audio from users dictating into an application’s microphone, enable command-and-control through voice, or transcribe audio files, among many other use cases. Multiple audio file formats are supported, including FLAC, AMR, PCMU/u-Law and linear-16.

What’s Next?
Codelabs
g.co/codelabs
For Developers
cloud.google.com/vision/
cloud.google.com/speech/
cloud.google.com/translate/
Stack Overflow
Contact:
@GCPBigData @apoorvsaxena1
