Artificial intelligence is a phrase with a lot of questions behind it. It is difficult to cut through the hype to understand how it will impact your organization so you can plan to utilize the benefits and overcome the challenges. In this session, we will demystify artificial intelligence and how it will impact records management. We will discuss real-life scenarios for artificial intelligence and provide practical advice for how records managers and information officers can prepare for this technology.
Key Issues this Presentation Will Address:
• The lack of understanding about what artificial intelligence is in relation to compliance and records management.
• How to approach a records management strategy that accounts for a future that includes artificial intelligence.
Key Takeaways from this Presentation
• Understand what artificial intelligence means in the context of records management.
• Be prepared to take advantage of the business benefits and overcome the challenges presented by artificial intelligence.
• Know how to guide the artificial intelligence compliance discussion.
12. “I don’t understand why some people are not concerned” - Bill Gates
“… full artificial intelligence could spell the end of the human race” - Stephen Hawking
AI is a “demon” that is “potentially more dangerous than nuclear weapons” [2] - Elon Musk
“The future is scary and very bad for people.” - Steve Wozniak
48. FEDERATED CONTENT
• Where does your content live?
• What is the business purpose of each content source?
LOOK AT RELATIONSHIPS
• How do data sources relate to one another?
• How can you model these relationships?
TRAIN MACHINE LEARNING
• It is an iterative process
• Build model, look at results, refine
LOOK AT APPLICATIONS
• Where is AI being used in your organization?
• Where could it be used?
People often associate automation with a feeling of losing control or visibility.
But is this feeling new? Traditional records management creates inefficiency, introduces human error and is really costly. For example, a Cohasset study concluded that manual classification costs on average 17 cents per document; for an organization with 25M documents, this would represent a cost of more than $4M.
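The arithmetic behind that figure can be checked with a short sketch. The two inputs are the numbers quoted above; the function and constant names are illustrative only.

```python
# Inputs quoted in the talk: 17 cents per document, 25 million documents.
COST_PER_DOCUMENT_CENTS = 17
DOCUMENT_COUNT = 25_000_000


def manual_classification_cost_dollars(documents: int, cents_per_document: int) -> float:
    """Return the total cost in dollars of classifying every document by hand."""
    return documents * cents_per_document / 100


total = manual_classification_cost_dollars(DOCUMENT_COUNT, COST_PER_DOCUMENT_CENTS)
print(f"${total:,.0f}")  # $4,250,000
```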
The filing cabinet approach doesn’t work anymore. While it might provide a sense of trust, it doesn’t provide control, because it depends too heavily on business users and gives no visibility of what is not being managed.
Automation gives you great visibility across multiple content sources and a consistent way to manage your records, and this actually results in much more control than you have with manual processes.
While in the past it was mainly treated as an administrative requirement, RM is now becoming a mission-critical part of information governance. As a result, the value of records management in operations, and even strategic planning, is becoming more widely recognized.
A lot has changed over the last few years, and the pace of change is getting faster and faster, mainly due to rapid and disruptive changes in information technology and the globalisation of business practices.
The volume of information being generated (and potentially treated as a record) is increasing exponentially.
Every day you, me, our families, colleagues, companies, everyone, create 2.5 billion gigabytes of data, and this number is expected to keep growing exponentially!
2.5 billion gigabytes of data is enough to fill 10 million Blu-ray discs which, if they were stacked on top of each other, would reach the height of four Eiffel Towers.
Other interesting stats suggest that every minute around 277,000 tweets are posted and 204,000,000 emails are sent.
To give another perspective on how quickly this is moving: almost 90% of the world's total data was generated in just the last 2 years. At this pace, we are expected to have 50 times more information by 2020.
In summary, our content sources are overflowing, and information and data growth is reaching a point where humans on their own are not able to process it.
If we think about 10 or 20 years ago, records were just about electronic and physical documents, and were largely sourced from “on premise” document management systems, file shares and physical locations.
This has changed so drastically that the definition of what is considered a “Record” has been forced to change over the years.
One of the main contributors to this has been the boom in cloud solutions, which has introduced new software, new formats and new sources into organizations at an adoption rate never seen before.
We moved from a world where consistently using the same tools was key to a world where people are empowered to use the tools that best allow them to get their job done.
Essentially from now on, we almost need to assume that everything is a record and records are everywhere!
The increasing complexity of compliance requirements is also posing a big challenge to you as records managers and to us as providers of records management solutions.
Companies are no longer affected only by local laws and regulations; they need to consider international laws and regulations too.
This requires companies to adopt much more complex recordkeeping and information-sharing processes.
There is no running away from the increasing information volume, the diversity of sources and higher compliance expectations.
Automation is becoming a key factor in setting RM up for success in this world of change.
“Eventually, I think human extinction will probably occur, and technology will likely play a part in this.” - Shane Legg, DeepMind [1] (DeepMind is part of Google)
“How can an AI system behave carefully and conservatively in a world populated by unknown unknowns?” - Tom Dietterich, president of the AAAI [2]
“It [AI] would take off on its own, and re-design itself at an ever increasing rate.” - Stephen Hawking [3] (on the consequences of creating something that can match or surpass humans)
“Humans, who are limited by slow biological evolution, couldn't compete, and would be superseded.” - Stephen Hawking [3]
Elon Musk’s hyperloop – LA to San Francisco in 30 minutes (760 mph); Los Angeles to Las Vegas in 20 minutes
Strength – tractor replaced horse-drawn plow that replaced human labor
Speed – Automobile replaced the horse that replaced walking
Sight – telescopes & microscopes enhance human visual capabilities
Hearing – non-electronic amplification (e.g., gramophone) replaced by electronic amplification (electric speakers)
In 1956, the computer scientist John McCarthy coined the term "Artificial Intelligence" (AI) to describe the study of intelligence by implementing its essential features on a computer.
A famous example of a technology platform using NLP is Watson, the IBM supercomputer.
To demonstrate its abilities, in 2011 IBM challenged two of the greatest Jeopardy! players ever to play against Watson.
This was a particularly interesting exercise because the questions were tricky and ambiguous, and the topics were far-ranging.
However, Watson was able to defeat the other players by using NLP to interpret the clues, which it received as text, and correctly understand what was being asked.
Humans and computers, individually, won’t be enough to drive businesses in the future.
We shouldn’t look at technology and automation as a threat or something that is going to take our jobs, but as an opportunity: an opportunity to improve our productivity by freeing up our time from tasks that are manual and repeatable or that involve processing huge amounts of data.
By taking these types of activities out of our day, we can concentrate on tasks that require personal expertise, face-to-face interaction, and critical and innovative thinking, and that are critical for business success.
It has been shown that when automation is well planned and implemented, it vastly improves productivity, makes staff happier and brings down costs.
There are 2 generations of automation: Robotic Process Automation, also known as rules-based automation, and Intelligent Automation.
While the first generation is mainly about transactional work activities that are repetitive in nature, the second generation goes beyond that and uses emerging technologies that apply cognitive skills like humans do.
It’s critical for records managers to understand the potential of these types of automation and the differences between them, so you can leverage them in your own organisation.
The first generation is mainly used to mimic human actions that are repetitive in nature and based on specific rules, in order to increase speed, accuracy, consistency and scalability. Basically, it performs millions of tasks 24/7 in a consistent way.
This type of automation is not new, and software tools that make use of it have been widely adopted across industries because:
First of all, they don’t require programming skills; they can be easily configured by non-technical users;
And secondly, they are not disruptive, so they can normally be deployed in organisations without much involvement from the IT department.
Outlook is a good example of a tool that makes use of rules-based automation to manage email.
There, users can define a set of rules to automatically apply certain actions to inbound and outbound messages.
Once rules are defined, emails are consistently processed without manual intervention, saving users time.
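The idea can be sketched in a few lines. This is not Outlook's actual rule engine or API; the `Email` and `Rule` classes, the folder names and the addresses are all hypothetical, chosen only to show how a fixed set of conditions is applied to each message in the same way.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Email:
    sender: str
    subject: str
    folder: str = "Inbox"


@dataclass
class Rule:
    name: str
    condition: Callable[[Email], bool]  # returns True when the rule applies
    target_folder: str


def apply_rules(email: Email, rules: List[Rule]) -> Email:
    """Move the email to the folder of the first rule whose condition matches."""
    for rule in rules:
        if rule.condition(email):
            email.folder = rule.target_folder
            break
    return email


# Hypothetical rules a user might configure.
rules = [
    Rule("Invoices", lambda e: "invoice" in e.subject.lower(), "Finance"),
    Rule("HR notices", lambda e: e.sender.endswith("@hr.example.com"), "HR"),
]

inbox = [
    Email("supplier@example.com", "Invoice 1042 attached"),
    Email("benefits@hr.example.com", "Open enrolment reminder"),
    Email("colleague@example.com", "Lunch on Friday?"),
]

for message in inbox:
    apply_rules(message, rules)
    print(f"{message.subject!r} -> {message.folder}")
```

Messages that match no rule simply stay in the inbox, which is the behaviour a user would expect from this kind of automation.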
An EDRMS like RecordPoint provides automated processes that capture and categorize records correctly, which ultimately ensures they are properly retained. Additionally, it can capture records from multiple sources and apply the same policies to rule them all!
Records managers can define a set of rules that dictate which content should be considered a record. Rules can also be set to automatically categorize records against the right category in the File Plan. Often the rules are based on the type of source where the record is stored and on the metadata of the record itself.
In essence, while business users add content to the content sources, the rules-based engine automatically identifies the content, profiles it, takes it through its ‘rules’ and manages that content under the appropriate retention schedule.
This reduces the complexity of these tasks dramatically: records managers don’t need to worry about manually filing each record directly in the EDRMS, and business users don’t need to worry about records anymore!
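As a rough illustration of rules that look at the source and the metadata, consider the sketch below. It does not reflect RecordPoint's actual rule model; the class names, file plan categories and retention periods are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ContentItem:
    source: str                                   # e.g. "sharepoint", "hr-fileshare"
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class ClassificationRule:
    category: str                                 # hypothetical file plan category
    retention_years: int                          # illustrative retention period
    matches: Callable[[ContentItem], bool]


def classify(item: ContentItem, rules: List[ClassificationRule]) -> Optional[ClassificationRule]:
    """Return the first matching rule, or None when the item is not a record."""
    for rule in rules:
        if rule.matches(item):
            return rule
    return None


FILE_PLAN_RULES = [
    ClassificationRule(
        category="Legal/Contracts",
        retention_years=7,
        matches=lambda i: i.metadata.get("content_type") == "Contract",
    ),
    ClassificationRule(
        category="HR/Personnel",
        retention_years=10,
        matches=lambda i: i.source == "hr-fileshare",
    ),
]

items = [
    ContentItem("sharepoint", {"content_type": "Contract", "title": "Supplier agreement"}),
    ContentItem("hr-fileshare", {"title": "Onboarding checklist"}),
    ContentItem("sharepoint", {"content_type": "Draft", "title": "Meeting notes"}),
]

for item in items:
    rule = classify(item, FILE_PLAN_RULES)
    outcome = f"{rule.category} ({rule.retention_years} years)" if rule else "not a record"
    print(f"{item.metadata['title']}: {outcome}")
```

The business user only saves the content; the decision about whether it is a record, and where it belongs in the file plan, is made by the rules.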
And why is this an important capability to have? A “must have” business requirement?
Experience tells us that relying on users to declare and categorise records involves high risks (e.g. misclassification) and limits scalability, as it assumes that everyone has a good understanding of the records policies.
The automation of the disposition process allows the correct and consistent disposition of records whether it is to destroy, keep permanently or transfer to the national archives or another agency.
Some systems offer predefined wizards and others allow fully customisable workflows.
In an EDRMS like RecordPoint, as content reaches its disposition due date, the system flags it or notifies the RM that there is content to be disposed of. From there, the RM can start the disposal workflow. This generally involves sending the identified batch of records for approval; depending on approval or rejection, the system allows the RM to proceed with the disposition of the records or requires further action.
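The flow described above (flag what is due, send the batch for approval, then dispose or escalate, leaving an audit trail) might be sketched as follows. The record titles, status strings and function names are hypothetical and do not represent any vendor's workflow engine.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Outcome recorded for each approved disposition action.
FINAL_STATUS = {"destroy": "destroyed", "transfer": "transferred", "keep": "kept permanently"}


@dataclass
class Record:
    title: str
    disposition_due: date
    action: str              # "destroy", "transfer" or "keep"
    status: str = "active"


def due_for_disposition(records: List[Record], today: date) -> List[Record]:
    """Flag active records whose disposition due date has been reached."""
    return [r for r in records if r.status == "active" and r.disposition_due <= today]


def run_disposal(batch: List[Record], approved: bool, audit_log: List[str]) -> None:
    """Apply the approval decision to the whole batch and log every outcome."""
    for record in batch:
        record.status = FINAL_STATUS[record.action] if approved else "needs further action"
        audit_log.append(f"{record.title}: {record.status}")


records = [
    Record("2009 supplier invoices", date(2017, 1, 1), "destroy"),
    Record("Board minutes 1998", date(2016, 6, 30), "transfer"),
    Record("Current lease", date(2030, 12, 31), "destroy"),
]
audit_log: List[str] = []

batch = due_for_disposition(records, today=date(2017, 6, 1))
run_disposal(batch, approved=True, audit_log=audit_log)
print(audit_log)
```

In a real system the approval would come from a reviewer rather than a boolean argument, but the shape of the process is the same.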
Why is this important? Organisations typically over-retain records.
But disposal per se is not sufficient. The correct disposition of records is critical to organisations, as they have legal, regulatory, compliance and security obligations to follow.
That’s why automation of the disposition workflow brings great benefits. It ensures that records that are no longer required are securely disposed of, ensures process consistency and compliance with policies, and provides appropriate audit trails.
This is where it gets interesting. With the growing advances in technologies that incorporate cognitive skills, this new generation of automation is gaining more and more momentum and raising the bar.
While this is still an emerging area, it’s time to start embracing it! Advisory companies like KPMG say that it’s imperative for businesses to recognize and understand its long-term potential and the roadblocks ahead, and to focus resources on mapping out a practical transition to it.
There are a few different technologies that are part of intelligent automation; however, today we are going to cover 2 of them: machine learning and natural language processing.
Before I continue can you please put your hand up if you have already heard about machine learning? And how many of you have heard about natural language processing?
Let’s go ahead and talk a bit more about what they are and how they are being used.
The ML concept was born from pattern recognition and from the theory that computers can learn on their own without being programmed to perform a specific task. In simple terms, solutions using machine learning can learn from past data and reach new decisions as they are exposed to new data. Over time, the accuracy of the decisions gets better and better.
One of the first well-known commercial successes of machine learning was Google, which proved that it was possible to find information with a computer algorithm that uses machine learning principles.
Since then, there have been many other commercial successes.
Amazon recommendations: an ML algorithm suggests other books that you might be interested in, based on your past preferences and on what other people with similar profiles have searched for or bought.
LinkedIn is another example: it suggests that you connect with people you are probably related to, professionally or personally, again based on the relationships you have, location, new connections and probabilities.
These are examples where the algorithms have learned from past data without requiring all the possible scenarios to be programmed. They learn as they go.
Auto-categorization is so far considered one of the best approaches to records categorization. Why? Because it makes use of intelligent policies based not only on the record’s metadata but also on content and context.
Unlike rule-based categorization, auto-categorization is adaptive: it evolves and learns over time as more records are categorised. Users don’t need to pre-program all the scenarios, since the algorithms are “intelligent” enough to decide as new scenarios occur. They also learn from their mistakes. For example, when decisions made on new scenarios are wrong and users correct them, the algorithm incorporates the feedback into future decisions.
Additionally, the logic used in the categorization decisions is auditable, for increased defensibility.
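To make the "learns from corrections" idea concrete, here is a minimal sketch of a text categorizer that can be updated one example at a time. The talk does not say which algorithm any particular product uses; a small multinomial naive Bayes model is swapped in purely for illustration, and the categories and sample texts are invented.

```python
import math
from collections import Counter, defaultdict


class TinyCategorizer:
    """Multinomial naive Bayes over word counts, trained incrementally."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of examples seen
        self.vocabulary = set()

    def learn(self, text, category):
        """Add one labelled example (initial training or a user correction)."""
        words = text.lower().split()
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def predict(self, text):
        """Return the most probable category for the text."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_category, best_score = None, -math.inf
        for category in self.doc_counts:
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(self.doc_counts[category] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


model = TinyCategorizer()
model.learn("supplier invoice payment due", "Finance")
model.learn("employment contract new hire", "HR")

first_guess = model.predict("supplier payment contract")
print(first_guess)   # Finance

# A user corrects the decision; the model incorporates the feedback.
model.learn("supplier payment contract", "Legal")
second_guess = model.predict("supplier contract")
print(second_guess)  # Legal
```

Nobody wrote a rule for the "Legal" category here; the model picked it up from a single correction, which is the behaviour that distinguishes this approach from the rules-based generation.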
The duplication of records is another big challenge for records managers. With the large number of electronic formats available, users create multiple copies of records: creating them in MS Word, converting them to PDF, printing them, etc. In addition, these duplicates are stored across different content sources and systems over which records managers have little visibility.
This leads to over-retention of records, since it is hard to detect them as duplicates.
It might be hard to define what types of rules would actually identify a duplicate in those conditions, but this is where machine learning can help. By scanning multiple content sources and automatically inspecting their metadata, content and context, machine learning can identify patterns to conclude that certain records are actually just duplicates and can be retained for a shorter period of time.
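One simple building block for this kind of detection is a content similarity score. The sketch below uses Jaccard similarity over word sets, which is a plain similarity measure rather than a trained model, so treat it as a stand-in for whatever a real product does. The file names, texts and the 0.8 threshold are assumptions for the example.

```python
import re
from itertools import combinations


def word_set(text):
    """Normalise text to a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words the two sets have in common (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_likely_duplicates(documents, threshold=0.8):
    """Return (name, name, score) for every pair at or above the threshold."""
    tokens = {name: word_set(text) for name, text in documents.items()}
    pairs = []
    for left, right in combinations(sorted(tokens), 2):
        score = jaccard(tokens[left], tokens[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs


# Hypothetical extracted text from three content sources.
documents = {
    "sharepoint/policy.docx": "Travel Policy 2017: Employees must book flights through the approved agency.",
    "fileshare/policy.pdf": "Travel policy 2017 - employees must book flights through the approved agency",
    "email/lunch.msg": "Team lunch is booked for Friday at noon.",
}

pairs = find_likely_duplicates(documents)
print(pairs)
```

The Word and PDF copies differ in punctuation and capitalisation, so a byte-for-byte comparison would miss them, while the normalised word sets match exactly.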
Unlike in the past, nowadays 80% of an organisation's information is unstructured and can’t be processed by traditional tools.
This presents challenges for both using and securing the data.
The lack of awareness of what exactly is contained in text-heavy documents makes it likely that enterprises will experience problems with the increasing number of regulations related to data privacy.
That’s where Natural Language Processing has a huge potential to actually extract meaning from unstructured data.
In simple terms, Natural Language Processing combines a set of linguistic, statistical and machine learning techniques that allow text to be analysed in a way that the machine can extract meaning from the words.
Traditional eDiscovery tools are mainly based on keyword searching. Even the more advanced ones just compare the words being searched, count their frequency and use semantic proximity.
These approaches fail to extract the meaning of a normal human search query.
By using NLP, discovery tools would allow humans to interact with them in a more natural way, making the discovery of content simpler, faster and more meaningful.
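The limitation is easy to reproduce with a minimal keyword-frequency scorer. The sketch below is not how any specific eDiscovery product works; the memo texts and the query are invented to show two documents with the same meaning receiving very different scores.

```python
import re
from collections import Counter


def keyword_score(query, document):
    """Count how many times the query's words appear in the document."""
    words = Counter(re.findall(r"[a-z]+", document.lower()))
    return sum(words[term] for term in re.findall(r"[a-z]+", query.lower()))


documents = {
    "memo-1": "The contract was terminated after the supplier missed delivery.",
    "memo-2": "We ended the agreement with the vendor because shipments were late.",
}

query = "terminated supplier contract"
scores = {name: keyword_score(query, text) for name, text in documents.items()}
print(scores)  # {'memo-1': 3, 'memo-2': 0}
```

Both memos describe the same event, but the second uses different wording and scores zero. Closing that gap requires understanding meaning rather than matching words.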
So, for organisations, where discovery of content is key, using NLP can be a huge game changer.
Another example of where NLP can be applied is the management of non-traditional records like voice-based ones, for example videos, phone calls or webinars.
With more traditional RM tools, these types of records can only be searched and classified using their metadata, so no great insights can be obtained from them.
However, tools using a combination of NLP technologies and text analytics are able to inspect the content and capture the main idea of what is being said. From there, records can be properly categorized and retained accordingly.
Seamless for the end user
Speed
Improved accuracy
Focus on non-repetitive activities
Better relationships
Decision support
Machines don't need to take breaks - work 24x7
Additional context to data
Has to be a business and IT conversation to get the context right
Focus on scenarios and use cases
Look at the ROI of an AI investment - look at the value add by freeing up people for other tasks
Understand the AI landscape - hopefully we covered this
Understand metrics and how to interpret them to prove value
Think about the value you would create by leveraging the information you have if you didn't have to do the repetitive processes
Provide the context for users - lexicon, taxonomy, other description and scenarios around data
Think about what context means in your organization - how systems could relate to one another
Get your wiring right - get rid of multiple silos - federated content