Patent Search: An important new test bed for IR

                    J. Tait, M. Lupu (1)
     H. Berger, G. Roda, M. Dittenbach, A. Pesenhofer (2)
                 E. Graf, K. van Rijsbergen (3)

         (1) Information Retrieval Facility, Vienna, Austria
         (2) Matrixware, Vienna, Austria
         (3) University of Glasgow, Dept. of Computing Science, Glasgow, UK


               DIR 2009 / Feb. 2-3, 2009
Patent Search.




  Patent search is a highly specialized form of information search.
  It is characterized by its
      target data
      type of information needs
      legal and economic implications
Target data


  Data for patent retrieval comes mainly from:
      patent databases from patent authorities (EPO, USPTO,
      JPO, SIPO, WIPO, etc.)
      scientific publications
      prior art databases (IP.com)




  A new acronym
  SIPO: State Intellectual Property Office of the People's Republic of
  China
Target data


  Characteristics of patent documents
      multilingual and written in 'legalese'
      non-uniform formats
      some are OCR’d
      figures, images, chemical formulas, DNA sequences
      include references to patent and non-patent literature




  A new acronym
  NPL: Non-Patent Literature
Information Needs.


  K.H. Atkinson, Towards a more rational patent search paradigm:



  depending on what group is doing the asking, the types of patent
  search requested may include simple patentability, clearance to
  market a product, validity, opposition to a patent being sought by
  another, infringement watch, creating IP landscapes for business
  development or R&D, infringement defense, litigation, prosecution
  support, and creation of portfolios for assignments, investments,
  mergers and acquisitions [ . . . ]
Legal and economic implications.


      patents are legal documents
      patent portfolios are assets for enterprises
      a single patent search can be worth several days of work




  High recall searches
  Missing even a single relevant document can have severe financial
  and economic impact. For example, a granted patent can be
  invalidated because of a document that was overlooked at
  application time.
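
A minimal sketch of what "high recall" means in measurable terms: recall is the fraction of all relevant documents that a search actually returned. The document identifiers below are hypothetical and serve only as an illustration.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that appear in the retrieved list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved)) / len(relevant)


# Hypothetical identifiers, for illustration only.
relevant_docs = {"EP-A", "EP-B", "EP-C", "EP-D"}
search_result = ["EP-A", "EP-X", "EP-B", "EP-C"]

# One relevant document (EP-D) was missed, so recall is 3 out of 4.
print(recall(search_result, relevant_docs))  # 0.75
```

In a patent setting the missed document is the costly one, which is why recall, rather than precision at the top of the list, tends to dominate the evaluation.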
Introduction
                             Patent Search
                     A modern IR test bed
              Promoting take up of research
                                Conclusion




We have characterized the patent search problem by describing its
target data, its types of information needs, and its legal and
economic implications.


Next:
    evaluating IR techniques in the patent domain
         previous initiatives in the area of patent retrieval
         the CLEF-IP and TREC-Chem initiatives
    promoting take-up of research




                                 Tait et al.   Patent Search: An important new test bed for IR
Test collections


  Test collections in Information Retrieval play a pivotal role in the
  evaluation of retrieval models.



  Domain-specific test collections already exist for:
       Web pages
       news stories
       legal documents
       blogs
       genomics
       patents
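
As an illustration of how a test collection supports evaluation, the sketch below treats a collection as a set of topics with known relevant documents and scores a system's ranked output with mean average precision. Mean average precision is one common choice, not a measure prescribed by these slides; the topic and document identifiers are hypothetical.

```python
def average_precision(ranking, relevant):
    """Average of the precision values at the ranks where relevant documents occur."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(run, judgments):
    """Mean of the per-topic average precision over all judged topics."""
    scores = [average_precision(run.get(topic, []), rel)
              for topic, rel in judgments.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical relevance judgments (topic -> relevant documents)
# and a hypothetical system run (topic -> ranked documents).
judgments = {"T1": {"D1", "D3"}, "T2": {"D2"}}
run = {"T1": ["D1", "D2", "D3"], "T2": ["D1", "D2"]}

print(round(mean_average_precision(run, judgments), 3))  # 0.667
```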
Pioneering work in patent retrieval.

  Patent retrieval task at the NTCIR Workshop (footnote 1) since 2001.
         produced test collections primarily targeting Japanese patents
         retrieval tasks
             ad-hoc (goal: find patents on a given topic)
             invalidity search (goal: find patents invalidating a given claim)
             patent classification according to the F-term system



  Two new acronyms
  F-term (abbreviation of File-forming term) is the classification
  system used in Japan as a complement to IPC (International
  Patent Classification)



    1
        http://research.nii.ac.jp/ntcir
Evaluation tracks.




  The IRF has engaged in two pilot evaluation tracks on patent
  retrieval
      CLEF-IP
      www.ir-facility.org/the_irf/clef-ip09-track
      TREC-Chem
      www.ir-facility.org/the_irf/trec_chem.htm
CLEF-Intellectual Property Initiative.

  CLEF-IP
         coordinated by the IRF
         part of the Cross-Language Evaluation Forum (footnote 2)
         will focus on the task of prior art search
         European patents as target data
         automatic extraction of relevance assessments



  Prior art search
  Prior art search consists in identifying all information (including
  NPL) that might be relevant to a patent’s claim of novelty.



    2
        http://www.clef-campaign.org
Prior art search.


  The most common type of patent search. Performed at various
  stages of the patent life-cycle and with different intentions:
      before filing an application (novelty search or patentability
      search) to determine whether the invention fulfills the
      requirements of
           novelty
           inventive step
      before grant: results go into a search report attached to the
      patent
      invalidity search: post-grant search used to unveil prior art
      that invalidates a patent’s claims of originality
Target data.



  The CLEF-IP evaluation track will restrict target data to patents.

  Target data:
      comprising 16 years (filing date between 1985 and 2000) of
      EPO patents
      1.9 million patent documents corresponding to 1 million
      patents
      75 GB, in XML format
      documents are in English, German, and French
Automatic extraction of relevance assessments.



  The data resulting from prior art searches are saved in the EPO or
  USPTO databases as:
      citations in patent applications
      citations in search reports
      citations in the legal files of opposition procedures

  The CLEF-IP track is going to extract this information (as much
  as possible) automatically in order to form a large set of topics.
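
A minimal sketch of automatic citation extraction, assuming a simplified and entirely hypothetical XML layout (the element and attribute names `patent-document`, `ucid`, `citation`, `source` and `doc-id` are illustrative, not the schema actually used by the track). Each topic patent is paired with the documents it cites, which then act as relevance assessments.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real collection's schema may differ.
SAMPLE = """
<patent-document ucid="EP-0000001-A1">
  <citations>
    <citation source="search-report"><doc-id>EP-0000002-A1</doc-id></citation>
    <citation source="applicant"><doc-id>EP-0000003-B1</doc-id></citation>
  </citations>
</patent-document>
"""


def extract_citations(xml_text):
    """Return (topic id, [(cited document id, citation source), ...])."""
    root = ET.fromstring(xml_text.strip())
    topic_id = root.get("ucid")
    cited = []
    for citation in root.iter("citation"):
        doc_id = citation.findtext("doc-id")
        if doc_id:
            cited.append((doc_id.strip(), citation.get("source", "unknown")))
    return topic_id, cited


topic, cited = extract_citations(SAMPLE)
for doc_id, source in cited:
    print(topic, doc_id, source)
```

Keeping the citation source alongside each cited document leaves room to weight citations from search reports and from opposition files differently at evaluation time.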
Prior art from opposition procedures.




      According to European patent law, a granted patent may be
      opposed.
      It is often the case that the opponent provides new prior art
      that invalidates the invention's claim of originality.
      Patents cited in opposition procedures are highly relevant
      prior art documents.
      They are the results of a very thorough invalidity search.
Crowdsourcing extraction of relevance assessments.



         Need to extract citations from documents arising from
         opposition procedures
         These documents are available only as scanned images (footnote 3)
         Will be using crowdsourcing for extracting these citations.




  A new word from business jargon
  Crowdsourcing.




    3
        at http://www.epoline.org
Relevance and evaluation measures.


  Labels used in search reports:

    label   means that cited document is
      X     relevant when taken alone
      Y     relevant in combination with other documents
      A     relevant but not prejudicial to novelty or inventive step




  How to use these labels for defining new evaluation measures?
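
One possible answer, offered only as a sketch: treat the labels as graded relevance and score a ranking with normalized discounted cumulative gain (nDCG). The gain values assigned to X, Y and A below are an assumption made for illustration, not something defined by the search reports or by the track.

```python
import math

# Assumed gains: X (relevant alone) > Y (relevant in combination) > A (background).
GAIN = {"X": 3, "Y": 2, "A": 1}


def dcg(gains):
    """Discounted cumulative gain with a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranking, labels, k=10):
    """nDCG at cutoff k; labels maps document id -> search report label."""
    gains = [GAIN.get(labels.get(doc_id), 0) for doc_id in ranking[:k]]
    ideal = sorted((GAIN.get(label, 0) for label in labels.values()), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical labelled citations for one topic.
labels = {"D1": "X", "D2": "A", "D3": "Y"}

print(ndcg(["D1", "D3", "D2"], labels))  # 1.0, the ideal ordering
print(ndcg(["D2", "D3", "D1"], labels) < 1.0)  # True, X is ranked last
```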
Challenges.




  As a result of the CLEF-IP track we expect to obtain new insights
  on:
      how to represent the information need expressed by a patent
      query reformulation
      evaluation metrics for patent retrieval
      using machine translation for improving retrieval effectiveness
TREC Chemistry track.



     Ad-hoc search
     Target data:
         academic papers (Royal Society of Chemistry)
         chemical patent documents (class C in the IPC)
     Will use automatic extraction of citations for relevance
     assessments
     Challenges:
         chemical names and structures
         chemical interactions, relations, transformations, properties




The IRF is contributing to the creation of new patent test
collections by organizing two tracks within the CLEF and
TREC evaluation campaigns.

In addition to the TREC and CLEF contributions, the IRF,
together with Matrixware, is promoting several initiatives
aimed at facilitating and improving the patent retrieval
process.







Promoting take up of research


  Next:
      presentation of the IRF and Matrixware
      promoting take up of research
          the IRF symposium
          the PaIR workshop
      providing the tools
      funding research in the area of patent retrieval




IRF: the Information Retrieval Facility.




    New international not-for-profit
    foundation, based in Vienna
    Its mission:
        to bridge the gap between the needs of
        industry and academic know-how
        to promote and facilitate research in
        large scale information retrieval
        to maintain a facility that enables large
        scale information retrieval and in-depth
        data processing
Matrixware.




    Founded 2005 in Vienna
    80 Employees
    > 15 Academic Partners Worldwide
    Implements solutions for access to patent
    information
Promoting research.



  Matrixware and the IRF have engaged in several initiatives aimed
  at promoting research and raising awareness in the area of patent
  retrieval.
      the Information Retrieval Facility Symposium
      an annual symposium held in Vienna to foster knowledge
      exchange between IR experts and IP professionals
      the PaIR workshop
      a workshop on Patent Information Retrieval hosted by the
      CIKM conference
Providing the tools.




  Successful IR research conventionally depends on three elements:
    1   the availability of test collections
    2   access to suitable software systems on which to run
        experiments
    3   access to sufficiently powerful hardware


  The IRF, supported by Matrixware, is providing all three of these.
Current University Projects.




      Accessibility of Information (Glasgow)
      Large Scale Logical Retrieval (Glasgow)
      Semantic Analysis of Patent Data (Sheffield and Nijmegen)
      Language Modeling for Patent Retrieval (UMass Amherst)
      OCR for patents (UMass Amherst)
Concluding remarks




     Patent retrieval is an interesting and important open
     challenge for IR researchers.
     The IRF and Matrixware have engaged in several projects
     aimed at promoting research in this area.



Invitation.



  You are invited to:
      join one of the evaluation tracks
           CLEF-IP
           TREC-Chem
      participate in the PaIR workshop
      participate in the Information Retrieval Facility Symposium




Thank you for your attention.

More Related Content

What's hot

II-PIC 2017: China: Life after the Patent Tsunami
II-PIC 2017: China: Life after the Patent TsunamiII-PIC 2017: China: Life after the Patent Tsunami
II-PIC 2017: China: Life after the Patent TsunamiDr. Haxel Consult
 
Patent database with one example
Patent database with one examplePatent database with one example
Patent database with one examplePallavi Belkar
 
Patent Process: Filing to Grant
Patent Process: Filing to GrantPatent Process: Filing to Grant
Patent Process: Filing to GrantAshwani Dhingra
 
Patent search analysis and report
Patent search analysis and reportPatent search analysis and report
Patent search analysis and reportYash Patel
 
Freedom to operate: Biosciences innovations and intellectual property manage...
Freedom to operate: Biosciences innovations and intellectual property manage...Freedom to operate: Biosciences innovations and intellectual property manage...
Freedom to operate: Biosciences innovations and intellectual property manage...ILRI
 
Patents 101: How to Do a Patent Search
Patents 101: How to Do a Patent SearchPatents 101: How to Do a Patent Search
Patents 101: How to Do a Patent SearchKristina Gomez
 
II-PIC 2017: Product presentation Lighthouse IP
II-PIC 2017: Product presentation Lighthouse IPII-PIC 2017: Product presentation Lighthouse IP
II-PIC 2017: Product presentation Lighthouse IPDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
How to do an effective patent search
How to do an effective patent searchHow to do an effective patent search
How to do an effective patent searchBjörn Jürgens
 
PhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsPhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsdgarijo
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesdgarijo
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and CeremonyArchiver
 

What's hot (16)

II-PIC 2017: China: Life after the Patent Tsunami
II-PIC 2017: China: Life after the Patent TsunamiII-PIC 2017: China: Life after the Patent Tsunami
II-PIC 2017: China: Life after the Patent Tsunami
 
Patent database with one example
Patent database with one examplePatent database with one example
Patent database with one example
 
Patent Process: Filing to Grant
Patent Process: Filing to GrantPatent Process: Filing to Grant
Patent Process: Filing to Grant
 
Patent database
Patent databasePatent database
Patent database
 
Patent search analysis and report
Patent search analysis and reportPatent search analysis and report
Patent search analysis and report
 
Freedom to operate: Biosciences innovations and intellectual property manage...
Freedom to operate: Biosciences innovations and intellectual property manage...Freedom to operate: Biosciences innovations and intellectual property manage...
Freedom to operate: Biosciences innovations and intellectual property manage...
 
Patent Search
Patent SearchPatent Search
Patent Search
 
Patents 101: How to Do a Patent Search
Patents 101: How to Do a Patent SearchPatents 101: How to Do a Patent Search
Patents 101: How to Do a Patent Search
 
Patent analysis
Patent analysisPatent analysis
Patent analysis
 
II-PIC 2017: Product presentation Lighthouse IP
II-PIC 2017: Product presentation Lighthouse IPII-PIC 2017: Product presentation Lighthouse IP
II-PIC 2017: Product presentation Lighthouse IP
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
How to do an effective patent search
How to do an effective patent searchHow to do an effective patent search
How to do an effective patent search
 
PhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsPhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflows
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciences
 
Patent search
Patent searchPatent search
Patent search
 
Design phase kick-off event and Ceremony
Design phase kick-off event and CeremonyDesign phase kick-off event and Ceremony
Design phase kick-off event and Ceremony
 

Viewers also liked

Pharmaceutical patents in india – compulsory licensing, health emergency & af...
Pharmaceutical patents in india – compulsory licensing, health emergency & af...Pharmaceutical patents in india – compulsory licensing, health emergency & af...
Pharmaceutical patents in india – compulsory licensing, health emergency & af...Rahul Dev
 
Compulsory licensing by surendra
Compulsory licensing by surendraCompulsory licensing by surendra
Compulsory licensing by surendraAnumulaSurendra
 
Introduction to patent search
Introduction to patent searchIntroduction to patent search
Introduction to patent searchPatSnap
 
Intellectual Property Rights
Intellectual Property RightsIntellectual Property Rights
Intellectual Property Rightsharshhanu
 

Viewers also liked (8)

Pharmaceutical patents in india – compulsory licensing, health emergency & af...
Pharmaceutical patents in india – compulsory licensing, health emergency & af...Pharmaceutical patents in india – compulsory licensing, health emergency & af...
Pharmaceutical patents in india – compulsory licensing, health emergency & af...
 
Compulsory licensing by surendra
Compulsory licensing by surendraCompulsory licensing by surendra
Compulsory licensing by surendra
 
CL PPT
CL PPTCL PPT
CL PPT
 
PCT
PCTPCT
PCT
 
Introduction to patent search
Introduction to patent searchIntroduction to patent search
Introduction to patent search
 
Indian patent act
Indian patent actIndian patent act
Indian patent act
 
The patent act
The patent actThe patent act
The patent act
 
Intellectual Property Rights
Intellectual Property RightsIntellectual Property Rights
Intellectual Property Rights
 

Similar to Patent Search: An important new test bed for IR

Intellectual Property Serrvices Outsourcing- India Company Overview
Intellectual Property Serrvices Outsourcing- India Company OverviewIntellectual Property Serrvices Outsourcing- India Company Overview
Intellectual Property Serrvices Outsourcing- India Company OverviewEPatents IP Services
 
Patent search analysis and report
Patent search analysis and reportPatent search analysis and report
Patent search analysis and reportYash Patel
 
Patent search from product specification final
Patent search from product specification finalPatent search from product specification final
Patent search from product specification finalIIITA
 
Methods to improve Freedom to Operate analysis
Methods to improve Freedom to Operate analysisMethods to improve Freedom to Operate analysis
Methods to improve Freedom to Operate analysisDauverC
 
A Survey Of Automated Hierarchical Classification Of Patents
A Survey Of Automated Hierarchical Classification Of PatentsA Survey Of Automated Hierarchical Classification Of Patents
A Survey Of Automated Hierarchical Classification Of PatentsCourtney Esco
 
PatAnalyse Presentation
PatAnalyse PresentationPatAnalyse Presentation
PatAnalyse Presentationzhiv12
 
PatAnalyse presentation
PatAnalyse presentationPatAnalyse presentation
PatAnalyse presentationvictor_zh
 
Process Protection Lieu Final
Process Protection Lieu FinalProcess Protection Lieu Final
Process Protection Lieu FinalFITT
 
Chi ham ip-workshop_databases_demo_chile
Chi ham ip-workshop_databases_demo_chileChi ham ip-workshop_databases_demo_chile
Chi ham ip-workshop_databases_demo_chileFundación COPEC - UC
 
CambridgeIP: Case Studies Of Recent Client Engagements
CambridgeIP: Case Studies Of Recent Client EngagementsCambridgeIP: Case Studies Of Recent Client Engagements
CambridgeIP: Case Studies Of Recent Client EngagementsCambridgeIP Ltd
 
mHealth Israel_ IP Strategy in China_Ehrlich & Fenster
mHealth Israel_ IP Strategy in China_Ehrlich & FenstermHealth Israel_ IP Strategy in China_Ehrlich & Fenster
mHealth Israel_ IP Strategy in China_Ehrlich & FensterLevi Shapiro
 
Data Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk ManagementData Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk ManagementData Science Thailand
 
PIIP_Patent search & Analysis_20160829
PIIP_Patent search & Analysis_20160829PIIP_Patent search & Analysis_20160829
PIIP_Patent search & Analysis_20160829YiTien Liao
 
Advancing Global Innovation: The Role of PCT Practice and Strategy
Advancing Global Innovation: The Role of PCT Practice and Strategy Advancing Global Innovation: The Role of PCT Practice and Strategy
Advancing Global Innovation: The Role of PCT Practice and Strategy spkowalski
 
FITT Toolbox: Protection
FITT Toolbox: ProtectionFITT Toolbox: Protection
FITT Toolbox: ProtectionFITT
 

Similar to Patent Search: An important new test bed for IR (20)

Intellectual Property Serrvices Outsourcing- India Company Overview
Intellectual Property Serrvices Outsourcing- India Company OverviewIntellectual Property Serrvices Outsourcing- India Company Overview
Intellectual Property Serrvices Outsourcing- India Company Overview
 
Patent search analysis and report
Patent search analysis and reportPatent search analysis and report
Patent search analysis and report
 
Patent search from product specification final
Patent search from product specification finalPatent search from product specification final
Patent search from product specification final
 
Methods to improve Freedom to Operate analysis
Methods to improve Freedom to Operate analysisMethods to improve Freedom to Operate analysis
Methods to improve Freedom to Operate analysis
 
A Survey Of Automated Hierarchical Classification Of Patents
A Survey Of Automated Hierarchical Classification Of PatentsA Survey Of Automated Hierarchical Classification Of Patents
A Survey Of Automated Hierarchical Classification Of Patents
 
PatAnalyse Presentation
PatAnalyse PresentationPatAnalyse Presentation
PatAnalyse Presentation
 
PatAnalyse presentation
PatAnalyse presentationPatAnalyse presentation
PatAnalyse presentation
 
Process Protection Lieu Final
Process Protection Lieu FinalProcess Protection Lieu Final
Process Protection Lieu Final
 
Chi ham ip-workshop_databases_demo_chile
Chi ham ip-workshop_databases_demo_chileChi ham ip-workshop_databases_demo_chile
Chi ham ip-workshop_databases_demo_chile
 
OTN - Mining the patent system to improve research and its commercialization ...
OTN - Mining the patent system to improve research and its commercialization ...OTN - Mining the patent system to improve research and its commercialization ...
OTN - Mining the patent system to improve research and its commercialization ...
 
CambridgeIP: Case Studies Of Recent Client Engagements
CambridgeIP: Case Studies Of Recent Client EngagementsCambridgeIP: Case Studies Of Recent Client Engagements
CambridgeIP: Case Studies Of Recent Client Engagements
 
An introduction to patent data
An introduction to patent dataAn introduction to patent data
An introduction to patent data
 
UNH Law/WIPO Summer School: 2017 Patent Information and its Usefulness
UNH Law/WIPO Summer School: 2017 Patent Information and its Usefulness UNH Law/WIPO Summer School: 2017 Patent Information and its Usefulness
UNH Law/WIPO Summer School: 2017 Patent Information and its Usefulness
 
mHealth Israel_ IP Strategy in China_Ehrlich & Fenster
mHealth Israel_ IP Strategy in China_Ehrlich & FenstermHealth Israel_ IP Strategy in China_Ehrlich & Fenster
mHealth Israel_ IP Strategy in China_Ehrlich & Fenster
 
Data Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk ManagementData Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk Management
 
PIIP_Patent search & Analysis_20160829
PIIP_Patent search & Analysis_20160829PIIP_Patent search & Analysis_20160829
PIIP_Patent search & Analysis_20160829
 
Pablo Benalcazar: Modern Tools on Patent Thicket Identification
Pablo Benalcazar: Modern Tools on Patent Thicket IdentificationPablo Benalcazar: Modern Tools on Patent Thicket Identification
Pablo Benalcazar: Modern Tools on Patent Thicket Identification
 
Advancing Global Innovation: The Role of PCT Practice and Strategy
Advancing Global Innovation: The Role of PCT Practice and Strategy Advancing Global Innovation: The Role of PCT Practice and Strategy
Advancing Global Innovation: The Role of PCT Practice and Strategy
 
R5 a報告
R5 a報告R5 a報告
R5 a報告
 
FITT Toolbox: Protection
FITT Toolbox: ProtectionFITT Toolbox: Protection
FITT Toolbox: Protection
 

More from Giovanna Roda

Distributed Computing for Everyone
Distributed Computing for EveryoneDistributed Computing for Everyone
Distributed Computing for EveryoneGiovanna Roda
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopGiovanna Roda
 
Introduction to Hadoop part 2
Introduction to Hadoop part 2Introduction to Hadoop part 2
Introduction to Hadoop part 2Giovanna Roda
 
Introduction to Hadoop part1
Introduction to Hadoop part1Introduction to Hadoop part1
Introduction to Hadoop part1Giovanna Roda
 
The need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioningThe need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioningGiovanna Roda
 
Apache Spark™ is here to stay
Apache Spark™ is here to stayApache Spark™ is here to stay
Apache Spark™ is here to stayGiovanna Roda
 
Chances and Challenges in Comparing Cross-Language Retrieval Tools
Chances and Challenges in Comparing Cross-Language Retrieval ToolsChances and Challenges in Comparing Cross-Language Retrieval Tools
Chances and Challenges in Comparing Cross-Language Retrieval ToolsGiovanna Roda
 

More from Giovanna Roda (7)

Distributed Computing for Everyone
Distributed Computing for EveryoneDistributed Computing for Everyone
Distributed Computing for Everyone
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Introduction to Hadoop part 2
Introduction to Hadoop part 2Introduction to Hadoop part 2
Introduction to Hadoop part 2
 
Introduction to Hadoop part1
Introduction to Hadoop part1Introduction to Hadoop part1
Introduction to Hadoop part1
 
The need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioningThe need for new paradigms in IT services provisioning
The need for new paradigms in IT services provisioning
 
Apache Spark™ is here to stay
Apache Spark™ is here to stayApache Spark™ is here to stay
Apache Spark™ is here to stay
 
Chances and Challenges in Comparing Cross-Language Retrieval Tools
Chances and Challenges in Comparing Cross-Language Retrieval ToolsChances and Challenges in Comparing Cross-Language Retrieval Tools
Chances and Challenges in Comparing Cross-Language Retrieval Tools
 

Recently uploaded

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Partners Life - Insurer Innovation Award 2024
Patent Search: An important new test bed for IR

  • 1. Patent Search: An important new test bed for IR. J. Tait, M. Lupu (1); H. Berger, G. Roda, M. Dittenbach, A. Pesenhofer (2); E. Graf, K. van Rijsbergen (3). (1) Information Retrieval Facility, Vienna, Austria; (2) Matrixware, Vienna, Austria; (3) University of Glasgow, Dept. of Computing Science, Glasgow, UK. DIR 2009 / Feb. 2-3, 2009
  • 2. Patent Search. Patent search is a highly specialized form of information search. It is characterized by its: target data; type of information needs; legal and economic implications.
  • 3. Target data. Data for patent retrieval comes mainly from: patent databases from patent authorities (EPO, USPTO, JPO, SIPO, WIPO, etc.); scientific publications; prior art databases (IP.com). A new acronym: SIPO, the State Intellectual Property Office of the People’s Republic of China.
  • 4. Target data. Characteristics of patent documents: multilingual and ‘legalese’; non-uniform formats; some are OCR’d; figures, images, chemical formulas, DNA sequences; include references to patent and non-patent literature. A new acronym: NPL, Non-Patent Literature.
  • 5. Information Needs. K.H. Atkinson, Towards a more rational patent search paradigm: depending on what group is doing the asking, the types of patent search requested may include simple patentability, clearance to market a product, validity, opposition to a patent being sought by another, infringement watch, creating IP landscapes for business development or R&D, infringement defense, litigation, prosecution support, and creation of portfolios for assignments, investments, mergers and acquisitions [ . . . ]
  • 6. Legal and economic implications. Patents are legal documents; patent portfolios are assets for enterprises; a single patent search can be worth several days of work. High recall searches: missing even a single relevant document can have severe financial and economic impact, for example when a granted patent is invalidated because of a document omitted at application time.
  • 7. Outline: Introduction; Patent Search; A modern IR test bed; Promoting take up of research; Conclusion. We have characterized the patent search problem by describing its target data, types of information needs, and legal and economic implications. Next: evaluating IR techniques in the patent domain; previous initiatives in the area of patent retrieval; the CLEF-IP and TREC-Chem initiatives; promoting take-up of research. Tait et al. Patent Search: An important new test bed for IR
  • 8. Test collections. Test collections in Information Retrieval play a pivotal role in the evaluation of retrieval models. Domain-specific test collections already exist for: Web pages; news stories; legal documents; blogs; genomics; patents.
  • 9. Pioneering work in patent retrieval. Patent retrieval task at the NTCIR Workshop (http://research.nii.ac.jp/ntcir) since 2001: produced test collections; primarily targeting Japanese patents; retrieval tasks: ad-hoc (goal: find patents on a given topic), invalidity search (goal: find patents invalidating a given claim), patent classification according to the F-term system. Two new acronyms: F-term (abbreviation of File-forming term) is the classification system used in Japan as a complement to IPC (International Patent Classification).
  • 10. Evaluation tracks. The IRF has engaged in two pilot evaluation tracks on patent retrieval: CLEF-IP (www.ir-facility.org/the_irf/clef-ip09-track); TREC-Chem (www.ir-facility.org/the_irf/trec_chem.htm).
  • 11. CLEF-Intellectual Property Initiative. CLEF-IP: coordinated by the IRF; part of the Cross-Language Evaluation Forum (http://www.clef-campaign.org); will focus on the task of prior art search; European patents as target data; automatic extraction of relevance assessments. Prior art search: identifying all information (including NPL) that might be relevant to a patent’s claim of novelty.
  • 12. Prior art search. The most common type of patent search. Performed at various stages of the patent life-cycle and with different intentions: before filing an application (novelty search or patentability search), to determine whether the invention fulfills the requirements of novelty and inventive step; before grant, with results going into a search report attached to the patent; invalidity search, a post-grant search used to unveil prior art that invalidates a patent’s claims of originality.
  • 13. Target data. The CLEF-IP evaluation track will restrict target data to patents. Target data: 16 years (filing date between 1985 and 2000) of EPO patents; 1.9 million patent documents corresponding to 1 million patents; 75 GB, in XML format; documents are in English, German, and French.
  • 14. Automatic extraction of relevance assessments. The data resulting from prior art searches is saved in the EPO or USPTO databases as: citations in patent applications; citations in search reports; citations in the legal files of opposition procedures. The CLEF-IP track is going to extract this information (as much as possible) automatically in order to form a large set of topics.
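The idea on slide 14, turning stored citations into topics with relevance assessments, can be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout, the document identifiers, the source names, and the helper names `build_qrels` and `recall_at_k` are all hypothetical and are not taken from the CLEF-IP track itself.

```python
from collections import defaultdict

# Hypothetical citation records: (citing_patent, cited_document, source).
# Identifiers and source names are invented for illustration only.
CITATIONS = [
    ("EP-A", "EP-X1", "application"),
    ("EP-A", "EP-X2", "search_report"),
    ("EP-A", "EP-X2", "opposition"),
    ("EP-B", "EP-X3", "search_report"),
]


def build_qrels(citations):
    """Treat each citing patent as a topic and its cited documents as relevant.

    Returns {topic: {cited_document: set of sources that cite it}}.
    """
    qrels = defaultdict(dict)
    for topic, cited, source in citations:
        # Remember every source in which this citation was found.
        qrels[topic].setdefault(cited, set()).add(source)
    return {topic: dict(docs) for topic, docs in qrels.items()}


def recall_at_k(ranking, relevant, k):
    """Fraction of the relevant documents found in the top k of a ranking."""
    if not relevant:
        return 0.0
    hits = len(set(ranking[:k]) & set(relevant))
    return hits / len(relevant)


qrels = build_qrels(CITATIONS)
print(recall_at_k(["EP-X2", "EP-Z", "EP-X1"], qrels["EP-A"], k=2))
```

Keeping the citation source alongside each judgment is a design choice of this sketch: it would let an evaluation weight, for example, opposition citations differently from application citations.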
  • 15. Prior art from opposition procedures. According to European patent law, a granted patent may be opposed. It is often the case that the opponent provides new prior art that invalidates the claim of originality of the invention. Patents cited in opposition procedures are very relevant prior art documents: they are the results of a very thorough invalidity search.
  • 16. Crowdsourcing extraction of relevance assessments. Need to extract citations from documents arising from opposition procedures. These documents are only available as scanned images (at http://www.epoline.org). Will be using crowdsourcing for extracting these citations. A new word from business jargon: crowdsourcing.
  • 17. Relevance and evaluation measures. Labels used in search reports (label: what it means for the cited document): X: relevant when taken alone; Y: relevant in combination with other documents; A: relevant but not prejudicial to novelty or inventive step. How to use these labels for defining new evaluation measures?
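One possible way to approach the question on slide 17 is to map the X/Y/A labels to graded gains and plug them into a standard graded measure such as nDCG. The sketch below is an assumption, not a measure proposed in the talk: the gain values in `GAIN` are arbitrary, and the function names are hypothetical.

```python
import math

# ASSUMED gain values: the slides define the labels but not any weights.
GAIN = {"X": 3, "Y": 2, "A": 1}


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranking, labels, k):
    """nDCG@k for one topic.

    ranking: list of document ids, best first.
    labels:  {document id: "X" | "Y" | "A"} taken from a search report.
    """
    # Documents without a label (or with an unknown label) contribute no gain.
    gains = [GAIN.get(labels.get(doc), 0) for doc in ranking[:k]]
    ideal = sorted((GAIN.get(label, 0) for label in labels.values()), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


labels = {"d1": "X", "d2": "Y", "d3": "A"}
print(ndcg(["d3", "d2", "d1"], labels, k=3))
```

Because Y documents are relevant only in combination with others, a per-document gain is a simplification; a measure that rewards retrieving the full combination would need a different formulation.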
  • 18. Challenges. As a result of the CLEF-IP track we expect to obtain new insights on: how to represent the information need given by a patent; query reformulation; evaluation metrics for patent retrieval; using machine translation for improving retrieval effectiveness.
  • 19. TREC Chemistry track. Ad-hoc search. Target data: academic papers (Royal Society of Chemistry); chemical patent documents (class C in the IPC). Will use automatic extraction of citations for relevance assessments. Challenges: chemical names and structures; chemical interactions, relations, transformations, properties.
  • 20. The IRF is contributing to the creation of new patent test collections by organizing two tracks within the CLEF and TREC evaluation campaigns. In addition to the TREC and CLEF contributions, the IRF, together with Matrixware, is promoting several initiatives aimed at facilitating and improving the patent retrieval process.
  • 21. Promoting take up of research. Next: presentation of the IRF and Matrixware; promoting take up of research (the IRF symposium, the PaIR workshop); providing the tools; funding research in the area of patent retrieval.
  • 22. IRF: the Information Retrieval Facility. New international not-for-profit foundation, based in Vienna. Its mission: to bridge the gap between the needs of the industry and the academic know-how; to promote and facilitate research in large scale information retrieval; to maintain a facility that enables large scale information retrieval and in-depth data processing.
  • 23. Matrixware. Founded 2005 in Vienna; 80 employees; > 15 academic partners worldwide; implements solutions for access to patent information.
  • 24. Promoting research. Matrixware and the IRF have engaged in several initiatives aimed at promoting research and raising awareness in the area of patent retrieval: the Information Retrieval Facility Symposium, an annual symposium held in Vienna to foster knowledge exchange between IR experts and IP professionals; the PaIR workshop, a workshop on Patent Information Retrieval hosted by the CIKM conference.
  • 25. Providing the tools. Successful IR research conventionally depends on three elements: (1) the availability of test collections; (2) access to suitable software systems on which to run experiments; (3) access to sufficiently powerful hardware. The IRF, supported by Matrixware, is providing all three of these.
  • 26. Current University Projects. Accessibility of Information (Glasgow); Large Scale Logical Retrieval (Glasgow); Semantic Analysis of Patent Data (Sheffield and Nijmegen); Language Modeling for Patent Retrieval (UMass Amherst); OCR for patents (UMass Amherst).
  • 27. Concluding remarks. Patent retrieval is an interesting and important open challenge for IR researchers. The IRF and Matrixware have engaged in several projects aimed at promoting research in this area.
  • 28. Invitation. You are invited to: join one of the evaluation tracks (CLEF-IP, TREC-Chem); participate in the PaIR workshop; participate in the Information Retrieval Facility Symposium.
  • 29. Thank you for your attention.