Owl and The Hummingbird - Ontology and SEO
1. THE OWL AND The Hummingbird
Ontology & SEO
Dawn Anderson
2. How Can SEO Be Dead?
•For Obsessive Compulsive Link Building Disorder – it may be, BUT...
•It was NEVER just about the links
•It was ALWAYS about ‘ontology’
•It’s ALWAYS been about library science, lexicons, relationships
•We still need to ‘search engine optimize’ sites for this – thinking like a machine (SEO)
4. Sergey’s Studies
PICTURE SOURCE: The Anatomy of a Large-Scale Hypertextual Web Search Engine
(Brin / Page – Stanford EDU, http://infolab.stanford.edu/~backrub/google.html)
Lexicon – 14 million words in 1998 – if straightforward winner??
Added last in photo finish
If further info is needed (no clear winner??)
Red and green parts NOT Sergey’s work
5. Ontology Definition (Information Science):
“In computer science and information science, an ontology formally represents knowledge as a hierarchy of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts.
Ontologies are the structural frameworks for organizing information and are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture as a form of knowledge representation about the world or some part of it. The creation of domain ontologies is also fundamental to the definition and use of an enterprise architecture framework.”
6. Meanwhile….. In Semantic Web
•People have been busy
•The W3C Working Group
•Why is it taking so long – decades? (It’s complicated – using RDF, OWL, XML, SKOS)
•The AAA Principle (Anyone can say Anything about Any topic)
•Always about relationships between things (entities)
•Machines / humans both understanding the web
•Formal Ontologies are key to this
•Web of Data was always vision
•The ‘Network Effect’ has not yet happened
7. What Is OWL?
•Web Ontology Language
•Why Not W.O.L. – Why OWL?
•“A Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things.” (source: W3C.org)
•The W3C chartered the OWL Working Group as part of the Semantic Web Activity in September 2007
8. Meanwhile….. In SEO
•Google Penguin & Panda cause chaos
•Hummingbird slips quietly in
•Link removal becomes the new link building
•‘Toxic’ links are removed or disavowed
•What to do next???
•A lot of titles get changed on LinkedIn
•Some even leave SEO
•SEO budgets get pulled / reduced ???
9. What About The Hummingbird?
•A complete rewrite
•Beginning to connect relationships between things (Semantics) – a step towards the semantic web (lexical ‘onomies – e.g. synonyms)
•Is it a coincidence Penguin and Hummingbird arrived within a short period of each other?
•It didn’t just appear – a lot of work has been going on teaching machines via lexical-semantic ontology learning / natural language processing
–Did Google’s Lexicon get better? Did links matter less because the Lexicon word catalogue got ‘grammar’?? (natural language)
10. Meanwhile….. In Content Marketing
•Copywriters go crazy with WordPress WYSIWYG editors
... Everywhere
•Fabulous content gets released...
... Everywhere
•With great PR headlines / titles...
... Everywhere
•Written for humans... not machines (with little or NO SEO)
•We begin to drown in a sea of content
•And everybody’s traffic ends up in their blog ;)
•Demand for CRO goes through the roof ;)
11. Meanwhile….. Everywhere
We end up with ‘fuzzy ontologies’
“A fuzzy ontology is one of vagueness... a domain or knowledge representation which is unclear and imprecise in nature as to what it relates to”
Fuzzy ontologies exist, and ontologists / information scientists work on ways to measure ‘fuzziness’ with degrees of logic,
e.g. 0.6 chance it might mean this
DON’T BE FUZZY
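A rough sketch of what “0.6 chance it might mean this” could look like in practice. The terms, meanings, degrees and the 0.8 threshold are all made-up assumptions for illustration, not values from any real ontology or search engine:

```python
# Hypothetical degrees: how strongly a term is taken to belong to each meaning.
MEANINGS = {
    "jaguar": {"animal": 0.6, "car": 0.4},
    "king lear": {"play": 1.0},
}

def best_meaning(term, threshold=0.8):
    """Return (meaning, degree, is_fuzzy) for the strongest candidate meaning."""
    candidates = MEANINGS[term.lower()]
    meaning, degree = max(candidates.items(), key=lambda item: item[1])
    # A term counts as 'fuzzy' when even its best meaning is below the threshold.
    return meaning, degree, degree < threshold

print(best_meaning("jaguar"))
print(best_meaning("King Lear"))
```

The less a page leaves to guesswork, the closer its terms sit to the second case than the first.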
12. Don’t Be Fuzzy – There Is Another Way
You can write for humans and machines too
Remember Google’s origins in data mining to organise books (library science)
MAKE LIKE A LIBRARY
16. Taxonomies
Animal
  Mammal
    Canine
    Feline
    Human
  Reptile
“All About Categories and Subcategories – broad to narrow”
Much more powerful – clear order for crawlers (and people too)
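A minimal sketch of the same taxonomy as a nested structure, with a helper that walks from broad to narrow (the kind of path a breadcrumb or URL structure might follow). The nesting is an assumed reading of the slide, and `path_to` is a hypothetical helper name:

```python
# Categories mapped to their subcategories (empty dict = no subcategories).
TAXONOMY = {
    "Animal": {
        "Mammal": {"Canine": {}, "Feline": {}, "Human": {}},
        "Reptile": {},
    }
}

def path_to(term, tree=TAXONOMY, trail=()):
    """Return the broad-to-narrow path to term, or None if it is absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == term:
            return here
        found = path_to(term, children, here)
        if found:
            return found
    return None

print(" > ".join(path_to("Feline")))  # Animal > Mammal > Feline
```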
17. Relationship Ontologies (Knowledge Domain)
[Diagram – entities: Shakespeare, Anne Hathaway, King Lear, Macbeth, UK, England, Scotland, Stratford; relationship labels: Married, Is In, Lived In, Set In, Wrote, Part Of]
“All About Relationships”
Structure AND cross relationships
18. Avoid Fuzz – When Writing For Web
•Write for people AND machines
•Googlebot can’t read infographics – put words with them
•Don’t dilute your domain ‘theme’ or internal anchor cloud – don’t link too many ‘irrelevant’ posts to ‘irrelevant’ posts
•Talk about what you do – obvious (you’d be surprised)
•‘Related content’ CAN be your commercial pages
•Check content keywords in GWT (surprised?)
•Avoid generics – use a category name in blog post structures (NOT ‘CATEGORY’)
•When engaging:
-Build a subject domain ‘Lexicon’ and weave it in
-Use lexical ‘onomies – synonyms, hyponyms, meronyms, holonyms, antonyms, verbs, associated keywords (look in GWT content keywords, thesauri, SERPs)
-Remember Hummingbird treats synonyms the same, so avoid stuffing
19. Avoid Fuzz – When Writing For Web
•Always get primary terms in somewhere
•Use relevant lexicon words in H1s, H2s, H3s, image alt tags, image file names
•Get primary keywords and sectional keywords in URLs and titles
•Avoid PR headline type titles in your meta title and H1 (riddled with stop words)
•Less is More – it’s all in the hints, clues, site structure (not stuffing or spinning)
-Connect ‘relationships’ where possible
-Use contextual internal linking as long as it’s RELEVANT
•Don’t use nonsense / generic / irrelevant post tags (you’d be surprised)
•Build taxonomical cluster menus where possible (very powerful)
•Build sectional themes in categories (keep it very narrow)
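One way to sanity-check a title against the points above is a quick script. This is only a sketch: the stop-word list is a small hand-picked assumption, `title_report` is a hypothetical helper, and “garden sheds” is an invented primary term:

```python
import re

# Small illustrative stop-word list (an assumption, not an official list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "you",
              "your", "why", "will", "this", "that", "for", "on", "with"}

def title_report(title, primary_term):
    """Report whether the primary term appears and how stop-word heavy the title is."""
    words = re.findall(r"[a-z0-9']+", title.lower())
    stop_ratio = sum(w in STOP_WORDS for w in words) / len(words) if words else 0.0
    return {
        "has_primary_term": primary_term.lower() in title.lower(),
        "stop_word_ratio": round(stop_ratio, 2),
    }

print(title_report("Why You Will Love This", "garden sheds"))
print(title_report("Wooden Garden Sheds: Sizes and Prices", "garden sheds"))
```

The PR-style headline scores high on stop words and misses the primary term; the second title does the opposite.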
20. When Developing For Web
•Avoid infinite loops (even in small WordPress sites) – each churn leaks relevance
•Avoid thin, ‘Panda vulnerable’, irrelevant content
•Avoid incorrectly implemented canonicalization (pages that are NOT the same)
•Avoid 301 redirects to irrelevant pages
•Avoid dilution through poor use of URL parameters and faceted navigation
•Bulk ‘relevance’ together in sections – connect horizontally and vertically
-Use ‘flattish’ silos for strong site sections, combined with cross-module internal contextual linking
•Build primary keyword presence through boilerplates
•Keep things moving – ‘action’ in commercial pages (something that changes and is HIGHLY relevant to the sectional theme)
•Products?? What products? Avoid generics
•All roads eventually lead to commercial targets – make them the most internally linked-to pages
•Use breadcrumbs & mega menus but avoid ‘jumble sales’ – look at conditional / highly related sectional menus (e.g. Widget Logic in WordPress)
•Name and categorise XML sitemaps; add categories as sites in GWT
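For the last point, a sketch of a sitemap index that names one sitemap per category. The namespace is the standard sitemaps.org one; the domain, category names and `sitemap-<category>.xml` file naming are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def sitemap_index(base_url, categories):
    """Build a sitemap index listing one named sitemap file per category."""
    root = ET.Element("sitemapindex", xmlns=NS)
    for category in categories:
        entry = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(entry, "loc")
        # Hypothetical naming scheme: sitemap-<category>.xml
        loc.text = f"{base_url}/sitemap-{category}.xml"
    return ET.tostring(root, encoding="unicode")

xml_text = sitemap_index("https://example.com", ["garden-sheds", "summer-houses"])
print(xml_text)
```

Each category sitemap can then be submitted and monitored separately, so indexing problems show up per section rather than per site.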
21. Bring It On…..
It’s easy to drop the ‘relevance’ ball at the first base through SEO neglect in pursuit of links / content marketing
Exhaust all possibilities via lexical relevance and internal links first, then move up
A page beats a page, not a site beats a site
A more ‘relevant’ page will still beat you – even without external links
Because…. Where is the ontology??
22. Dumb Machines
•Disambiguation – a tomato is a fruit
•So a tomato goes in a fruit salad??
Duh
•Machines are still a bit stupid
27. Don’t Neglect SEO
Take it for what it was always meant to be – relationships, domain ontology, library science
-Winning on relevance first, then getting the final votes in a photo finish
-Googlebot is still your primary persona (unless you don’t want organic traffic ;))
-Everything affects everything in your site’s ‘world’ representation (every word, every internal link, every developer file upload)
28. REMEMBER THIS
“You can have the best dress in the world….
But if you’re in a dark room with the lights turned off,
no-one will see it…”
SEO still counts
“If you build it RIGHT… they will come”
30. Talk In Triples (RDF)
SUBJECT       PREDICATE       OBJECT
Shakespeare   Wrote           Hamlet
England       Is Part Of      UK
Brad Pitt     Is Married To   Angelina Jolie
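The three triples above can be held as plain (subject, predicate, object) tuples and queried by pattern. This is a toy sketch, not a real RDF store; `match` is a hypothetical helper and `None` is used as a wildcard:

```python
# The slide's triples as (subject, predicate, object) tuples.
TRIPLES = [
    ("Shakespeare", "wrote", "Hamlet"),
    ("England", "is part of", "UK"),
    ("Brad Pitt", "is married to", "Angelina Jolie"),
]

def match(subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

print(match(predicate="wrote"))  # what was written, and by whom?
print(match(obj="UK"))           # what relates to the UK?
```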
31. Use Semantic Relationships (Lexical Onomies)
In words and taxonomies (linked and unlinked)
FUNCTIONALITY              RELATIONSHIP   CONCEPT                  EXAMPLES
Describing relationships   Synonymy       Similarities             “buy” and “purchase”, “big” and “large”
Describing relationships   Antonymy       Differences (opposites)  “wet” and “dry”
Describing relationships   Hyponymy       Specialization           “Red is a colour”
Describing relationships   Meronymy       Part / Whole             “Finger is a meronym of hand”
Describing relationships   Holonymy       Whole / Part             “Hand is a holonym of finger”
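A small sketch of a domain ‘Lexicon’ built from the examples in the table, read in both directions. The inverse names are standard linguistics terms (‘hypernym’ is the inverse of ‘hyponym’ and does not appear on the slide); `related` is a hypothetical helper:

```python
# (word, relation, other word): "finger is a meronym of hand", and so on.
RELATIONS = [
    ("buy", "synonym", "purchase"),
    ("big", "synonym", "large"),
    ("wet", "antonym", "dry"),
    ("red", "hyponym", "colour"),
    ("finger", "meronym", "hand"),
]

# How each relation reads from the other side.
INVERSE = {
    "synonym": "synonym",
    "antonym": "antonym",
    "hyponym": "hypernym",
    "meronym": "holonym",
}

def related(word):
    """List (relation, other_word) pairs, reading each fact in both directions."""
    found = []
    for left, relation, right in RELATIONS:
        if left == word:
            found.append((relation, right))
        if right == word:
            found.append((INVERSE[relation], left))
    return found

print(related("hand"))  # [('holonym', 'finger')]
print(related("buy"))   # [('synonym', 'purchase')]
```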
34. Further Reading
1.Semantic Web For The Working Ontologist – Dean Allemang / Jim Hendler
2.Studies On The Semantic Web: Perspectives On Ontology Learning – Jens Lehmann / Johanna Völker
3.OWL: Representing Information Using The Web Ontology Language – Lee W Lacy
4.Semantic Web Programming – Hebeler, Fisher, Blace, Perez-Lopez
5.Programming The Semantic Web – Segaran, Evans & Taylor
35. Further Reading
1.http://wortschatz.uni-leipzig.de/~cbiemann/pub/2005/OntoML05proceedings.pdf
2.http://infolab.stanford.edu/~backrub/google.html
3.http://searchengineland.com/killer-seo-string-entity-optimization-171094
4.http://www.seobythesea.com/category/fact-extraction/
5.http://ilpubs.stanford.edu:8090/421/1/1999-65.pdf
6.Fuzzy Ontologies - http://link.springer.com/chapter/10.1007%2F978-3-540-77581-2_10#page-1