Camomile : A Unicode library for OCaml

                   Yoriyuki Yamagata

  National Institute of Advanced Industrial Science and Technology (AIST)


        ML Workshop, September 18, 2011
Outline

   Overview


   ASCII to Unicode : A challenge of multilingualization


   Example : Unicode normal forms


   ulib


   Conclusion
Overview - functionality
   Camomile - A Unicode library for OCaml
      Unicode character type
      UTF-8, UTF-16, UTF-32 strings
      Conversion to/from approx 200 encodings
      Case mapping
      Collation (sort and search)
Overview - features
      Supports only “logical” operations
      No support for rendering or formatting
      Written purely in OCaml
      Functors and lazy evaluation play crucial roles
ASCII to Unicode : challenge of multilingualization
   Large number of characters
                 code range 0x0 - 0x10ffff
   Multiple representations of strings
                 UTF-8, UTF-16 and UTF-32
                 legacy encodings
   Combining characters
                 ä = a + ◌̈
                 Nguyễn = Nguyê + ◌̃ + n = Nguye + ◌̂ + ◌̃ + n
                 ậ = a + ◌̣ + ◌̂ = a + ◌̂ + ◌̣
   Diverse cultural conventions
                 Case mapping OΣOΣ → oσoς (Greek)
                 Sorting ... < H < CH < I < ... (Slovak)
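
The "multiple representations" and "combining characters" points can be illustrated with a small sketch that uses only the OCaml standard library (not Camomile). The helper name `utf8_of_code_points` is made up for this example. It encodes code points as UTF-8 and shows that the precomposed and the decomposed spelling of "ä" are different byte strings.

```ocaml
(* Encode a list of code points as UTF-8 with the standard library.
   [utf8_of_code_points] is a hypothetical helper for this sketch. *)
let utf8_of_code_points (cps : int list) : string =
  let buf = Buffer.create 16 in
  List.iter (fun cp -> Buffer.add_utf_8_uchar buf (Uchar.of_int cp)) cps;
  Buffer.contents buf

(* "ä" as one precomposed code point (U+00E4) ... *)
let precomposed = utf8_of_code_points [0x00E4]

(* ... and as "a" followed by COMBINING DIAERESIS (U+0308). *)
let decomposed = utf8_of_code_points [0x0061; 0x0308]

(* The last code point of the range, U+10FFFF, takes four bytes in UTF-8. *)
let last_code_point = utf8_of_code_points [0x10FFFF]

let () =
  (* Both spellings display as the same letter, yet the bytes differ. *)
  Printf.printf "%d bytes vs %d bytes, equal: %b\n"
    (String.length precomposed) (String.length decomposed)
    (String.equal precomposed decomposed)
```

A byte-wise comparison therefore treats the two spellings as different strings, which is the problem normal forms address.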
Unicode normal forms - what are they?


   Unicode has multiple representations of the “same” string.
   E.g. ậ = ạ + ◌̂ = a + ◌̣ + ◌̂ = a + ◌̂ + ◌̣ etc.
   Normal forms give a unique representation
   There are 4 normal forms
    1. NFD
    2. NFC
    3. NFKD
    4. NFKC

   We concentrate on NFD
Unicode normal form - NFD

   1. Decompose characters as much as possible
            ậ ⇒ ạ + ◌̂ ⇒ a + ◌̣ + ◌̂
   2. Do a stable sort on combining characters based on
      combining class
            a + ◌̂ + ◌̣ ⇒ a + ◌̣ + ◌̂
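
The two NFD steps can be sketched in plain OCaml over lists of code points. This is not Camomile's implementation. The tables below are a tiny hand-written subset chosen for the example (a real library takes them from the Unicode Character Database), and the names `combining_class`, `decompose_one`, `reorder` and `nfd` are assumptions of this sketch.

```ocaml
(* Canonical combining classes for a few code points (illustrative subset).
   Anything not listed is treated as a starter, i.e. class 0. *)
let combining_class = function
  | 0x0323 -> 220  (* COMBINING DOT BELOW *)
  | 0x0302 -> 230  (* COMBINING CIRCUMFLEX ACCENT *)
  | 0x0303 -> 230  (* COMBINING TILDE *)
  | 0x0308 -> 230  (* COMBINING DIAERESIS *)
  | _ -> 0

(* One level of canonical decomposition (illustrative subset). *)
let decompose_one = function
  | 0x1EAD -> [0x1EA1; 0x0302]  (* a with circumflex and dot below *)
  | 0x1EA1 -> [0x0061; 0x0323]  (* a with dot below *)
  | 0x00E2 -> [0x0061; 0x0302]  (* a with circumflex *)
  | cp -> [cp]

(* Step 1: decompose as much as possible. *)
let rec decompose cp =
  match decompose_one cp with
  | [c] when c = cp -> [cp]
  | parts -> List.concat (List.map decompose parts)

(* Step 2: stable-sort every run of non-starters by combining class.
   [run] and [acc] are kept in reverse order while scanning. *)
let reorder cps =
  let flush run acc =
    let sorted =
      List.stable_sort
        (fun a b -> compare (combining_class a) (combining_class b))
        (List.rev run) in
    List.rev_append sorted acc in
  let rec go run acc = function
    | [] -> List.rev (flush run acc)
    | cp :: rest when combining_class cp = 0 ->
      go [] (cp :: flush run acc) rest
    | cp :: rest -> go (cp :: run) acc rest in
  go [] [] cps

let nfd cps = reorder (List.concat (List.map decompose cps))
```

With these tables, the precomposed letter `[0x1EAD]` and the sequence `[0x0061; 0x0302; 0x0323]` both normalize to `[0x0061; 0x0323; 0x0302]`. The stable sort matters because marks of equal class, such as tilde and circumflex, must keep their original order.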
Camomile strings - UTF8, UTF16, UCS4
  UTF8
  UTF-8 string as a string

  UTF16
  UTF-16 string as an unsigned 16-bit integer bigarray

  UCS4
  UTF-32 string as a 32-bit integer bigarray

  UnicodeString.Type
  UTF8, UTF16 and UCS4 all conform to UnicodeString.Type
  String operations are functors over UnicodeString.Type
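
The idea of writing a string operation once, as a functor over a common string signature, can be sketched as follows. The signature `TEXT` is a deliberately small stand-in invented for this example and is much simpler than Camomile's `UnicodeString.Type`. The module names `Ucs4_array`, `Cp_list` and `Count_non_ascii` are likewise hypothetical.

```ocaml
(* Hypothetical, minimal string signature (not UnicodeString.Type). *)
module type TEXT = sig
  type t
  val length : t -> int       (* number of code points *)
  val get : t -> int -> int   (* i-th code point *)
end

(* UTF-32-like representation: one array cell per code point. *)
module Ucs4_array : TEXT with type t = int array = struct
  type t = int array
  let length = Array.length
  let get = Array.get
end

(* A second representation, to show the functor does not care which. *)
module Cp_list : TEXT with type t = int list = struct
  type t = int list
  let length = List.length
  let get = List.nth
end

(* The operation is written once, against TEXT only. *)
module Count_non_ascii (Text : TEXT) = struct
  let count (s : Text.t) : int =
    let n = ref 0 in
    for i = 0 to Text.length s - 1 do
      if Text.get s i > 0x7F then incr n
    done;
    !n
end

(* Instantiate it for each representation. *)
module A = Count_non_ascii (Ucs4_array)
module L = Count_non_ascii (Cp_list)
```

Here `A.count [|0x61; 0xE4; 0x3A3|]` and `L.count [0x61; 0xE4; 0x3A3]` run the same code over two different string representations.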
Camomile modules - UNF
  Module for Unicode normal forms
       module type Type =
       sig
         type text
         type index  (* position within a text *)

         (* Conversion to NFD, NFKD, NFC and NFKC *)
         val nfd  : text -> text
         val nfkd : text -> text
         val nfc  : text -> text
         val nfkc : text -> text

         (* Compare strings by semantic equivalence,
            by lazily building their NFDs and comparing them *)
         val canon_compare : text -> text -> int
       end

       (* Create a module for a given Unicode string *)
       module Make (Text : UnicodeString.Type) :
         Type with type text = Text.t and
         type index = Text.index
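
Why laziness helps in a comparison like `canon_compare` can be sketched with the standard `Seq` module. This is not Camomile's code. `compare_seq` and `counted` are names invented here, and `counted` only stands in for an incremental normalizer by counting how many code points were actually produced.

```ocaml
(* Compare two lazily produced code point sequences, forcing only as
   much of each as is needed to decide the result. *)
let rec compare_seq (a : int Seq.t) (b : int Seq.t) : int =
  match a (), b () with
  | Seq.Nil, Seq.Nil -> 0
  | Seq.Nil, Seq.Cons _ -> -1
  | Seq.Cons _, Seq.Nil -> 1
  | Seq.Cons (x, a'), Seq.Cons (y, b') ->
    let c = compare x y in
    if c <> 0 then c else compare_seq a' b'

(* Stand-in for an incremental normalizer: yields code points on demand
   and records in [forced] how many were produced. *)
let counted (forced : int ref) (cps : int list) : int Seq.t =
  Seq.map (fun cp -> incr forced; cp) (List.to_seq cps)
```

If two long strings differ at the second code point, only two elements of each sequence are ever produced, so the remainder of both strings is never normalized.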
ulib - yet another Unicode library
   Now under development
ulib - yet another Unicode library
   ulib is compact
       Minimal functionality
       No data files
       No initialization
ulib - yet another Unicode library
   ulib is modern
       Rope for Unicode string
       Zipper for indexing rope
       Pluggable code converter using first class modules
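
One way a pluggable code converter could be built from first-class modules is sketched below. This is not ulib's actual API. The `ENCODER` signature, the two encoders, the `registry` table and the functions `register` and `encode_with` are all assumptions made for illustration.

```ocaml
(* Hypothetical encoder interface (not ulib's API). *)
module type ENCODER = sig
  val name : string
  val encode : int list -> string   (* code points -> bytes *)
end

module Utf8 : ENCODER = struct
  let name = "UTF-8"
  let encode cps =
    let b = Buffer.create 16 in
    List.iter (fun cp -> Buffer.add_utf_8_uchar b (Uchar.of_int cp)) cps;
    Buffer.contents b
end

module Latin1 : ENCODER = struct
  let name = "ISO-8859-1"
  let encode cps =
    let b = Buffer.create 16 in
    List.iter
      (fun cp ->
         (* Anything outside Latin-1 is replaced by '?'. *)
         Buffer.add_char b (if cp < 0x100 then Char.chr cp else '?'))
      cps;
    Buffer.contents b
end

(* Packed modules are ordinary values, so encoders can be stored in a
   table and registered at run time. *)
let registry : (string, (module ENCODER)) Hashtbl.t = Hashtbl.create 8

let register (module E : ENCODER) =
  Hashtbl.replace registry E.name (module E : ENCODER)

let () = register (module Utf8); register (module Latin1)

(* Look an encoder up by name and apply it. *)
let encode_with (name : string) (cps : int list) : string option =
  match Hashtbl.find_opt registry name with
  | Some (module E : ENCODER) -> Some (E.encode cps)
  | None -> None
```

A new encoding is added by calling `register` with another module of type `ENCODER`, without touching the code that performs the lookup.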
Conclusion
     Unicode is different from ASCII
     Camomile addresses the "logical" part of Unicode
     Functors and laziness play crucial roles
     A simpler library, "ulib", is now under development.
Project URL




   Camomile https://github.com/yoriyuki/Camomile
         ulib https://github.com/yoriyuki/ulib

Decarbonising Buildings: Making a net-zero built environment a reality
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
4. Cobus Valentine- Cybersecurity Threats and Solutions for the Public Sector
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 

Camomile : A Unicode library for OCaml

  • 1. Camomile : A Unicode library for OCaml Yoriyuki Yamagata National Institute of Advanced Industrial Science and Technology (AIST) ML Workshop, September 18, 2011
  • 2. Outline Overview ASCII to Unicode : A challenge of multilingualization Example : Unicode normal forms ulib Conclusion
  • 5. Overview - functionality Camomile - A Unicode library for OCaml
  • 6. Overview - functionality Camomile - A Unicode library for OCaml Unicode character type
  • 7. Overview - functionality Camomile - A Unicode library for OCaml Unicode character type UTF-8, UTF-16, UTF-32 strings
  • 8. Overview - functionality Camomile - A Unicode library for OCaml Unicode character type UTF-8, UTF-16, UTF-32 strings Conversion to/from approx 200 encodings
  • 9. Overview - functionality Camomile - A Unicode library for OCaml Unicode character type UTF-8, UTF-16, UTF-32 strings Conversion to/from approx 200 encodings Case mapping
  • 10. Overview - functionality Camomile - A Unicode library for OCaml Unicode character type UTF-8, UTF-16, UTF-32 strings Conversion to/from approx 200 encodings Case mapping Collation (sort and search)
  • 12. Overview - feature Only support “logical” operations
  • 13. Overview - feature Only support “logical” operations No support for rendering or formatting
  • 14. Overview - feature Only support “logical” operations No support for rendering or formatting Purely written in OCaml
  • 15. Overview - feature Only support “logical” operations No support for rendering or formatting Purely written in OCaml Functors and lazy evaluation play crucial roles
  • 16. Outline Overview ASCII to Unicode : A challenge of multilingualization Example : Unicode normal forms ulib Conclusion
  • 17. ASCII to Unicode : challenge of multilingualization
  • 18. ASCII to Unicode : challenge of multilingualization Large number of characters
  • 19. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff
  • 20. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings
  • 21. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32
  • 22. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings
  • 23. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters
  • 24. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters ä = a + ¨
  • 25. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters ä = a + ¨ Nguyễn = Nguyê + ˜ + n = Nguye + ˆ + ˜ + n
  • 26. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters ä = a + ¨ Nguyễn = Nguyê + ˜ + n = Nguye + ˆ + ˜ + n ậ = a + ◌̣ + ˆ = a + ˆ + ◌̣
  • 27. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters ä = a + ¨ Nguyễn = Nguyê + ˜ + n = Nguye + ˆ + ˜ + n ậ = a + ◌̣ + ˆ = a + ˆ + ◌̣ Diverse cultural conventions
  • 28. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters ä = a + ¨ Nguyễn = Nguyê + ˜ + n = Nguye + ˆ + ˜ + n ậ = a + ◌̣ + ˆ = a + ˆ + ◌̣ Diverse cultural conventions Case mapping OΣOΣ → oσoς (Greek)
  • 29. ASCII to Unicode : challenge of multilingualization Large number of characters code range 0x0 - 0x10ffff Multiple representation of strings UTF-8, UTF-16 and UTF-32 legacy encodings Combining characters ä = a + ¨ Nguyễn = Nguyê + ˜ + n = Nguye + ˆ + ˜ + n ậ = a + ◌̣ + ˆ = a + ˆ + ◌̣ Diverse cultural conventions Case mapping OΣOΣ → oσoς (Greek) Sorting ... < H < CH < I < ... (Slovak)
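
The "multiple representation of strings" point can be made concrete with a small sketch. It uses only the OCaml standard library (Buffer.add_utf_8_uchar and Buffer.add_utf_16be_uchar, available since OCaml 4.06), not Camomile; the function name encoded_sizes is made up for this illustration.

```ocaml
(* Encode one code point and report how many bytes it occupies in
   UTF-8, UTF-16 and UTF-32. [encoded_sizes] is a hypothetical helper. *)
let encoded_sizes (cp : int) : int * int * int =
  let u = Uchar.of_int cp in
  let utf8 = Buffer.create 4 and utf16 = Buffer.create 4 in
  Buffer.add_utf_8_uchar utf8 u;
  Buffer.add_utf_16be_uchar utf16 u;
  (* UTF-32 always uses one 4-byte unit per code point. *)
  (Buffer.length utf8, Buffer.length utf16, 4)

(* The same visible character can also be spelled in two ways:
   precomposed U+00E4, or 'a' followed by combining diaeresis U+0308. *)
let precomposed = "\xC3\xA4"   (* UTF-8 bytes of U+00E4 *)
let decomposed = "a\xCC\x88"   (* UTF-8 bytes of U+0061 U+0308 *)
let same_bytes = String.equal precomposed decomposed
```

Here encoded_sizes 0x1F600 reports 4 bytes in every encoding, while encoded_sizes 0xE4 reports 2, 2 and 4. same_bytes is false, which is why byte-wise comparison is not enough and normal forms are needed.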
  • 30. Outline Overview ASCII to Unicode : A challenge of multilingualization Example : Unicode normal forms ulib Conclusion
  • 31. Unicode normal forms - what is it?
  • 32. Unicode normal forms - what is it? Unicode has multiple representations of “same” strings.
  • 33. Unicode normal forms - what is it? Unicode has multiple representations of “same” strings. E.g. ậ = ạ + ˆ = a + ◌̣ + ˆ = a + ˆ + ◌̣, etc.
  • 34. Unicode normal forms - what is it? Unicode has multiple representations of “same” strings. E.g. ậ = ạ + ˆ = a + ◌̣ + ˆ = a + ˆ + ◌̣, etc. Normal forms give the unique representations There are 4 normal forms 1. NFD 2. NFC 3. NFKD 4. NFKC
  • 35. Unicode normal forms - what is it? Unicode has multiple representations of “same” strings. E.g. ậ = ạ + ˆ = a + ◌̣ + ˆ = a + ˆ + ◌̣, etc. Normal forms give the unique representations There are 4 normal forms 1. NFD 2. NFC 3. NFKD 4. NFKC We concentrate on NFD
  • 37. Unicode normal form - NFD 1. Decompose characters as much as possible ậ ⇒ ạ + ˆ ⇒ a + ◌̣ + ˆ
  • 38. Unicode normal form - NFD 1. Decompose characters as much as possible ậ ⇒ ạ + ˆ ⇒ a + ◌̣ + ˆ 2. Do stable sort on combining characters based on combining class a + ˆ + ◌̣ ⇒ a + ◌̣ + ˆ
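
The two NFD steps above can be sketched in plain OCaml. This is a toy, not Camomile's implementation: the decomposition and combining-class tables cover only the few code points used on the slides (real tables come from the Unicode Character Database), code points are plain ints, and all function names are assumptions made for this illustration.

```ocaml
(* Toy canonical decomposition table (assumption: only these entries). *)
let decomposition = function
  | 0x00E2 -> Some [0x0061; 0x0302]   (* a-circumflex -> a + circumflex *)
  | 0x00E4 -> Some [0x0061; 0x0308]   (* a-diaeresis -> a + diaeresis *)
  | 0x1EA1 -> Some [0x0061; 0x0323]   (* a-dot-below -> a + dot below *)
  | 0x1EAD -> Some [0x1EA1; 0x0302]   (* a-circumflex-dot-below *)
  | _ -> None

(* Toy combining classes: dot below is 220, marks above are 230. *)
let combining_class = function
  | 0x0323 -> 220
  | 0x0302 | 0x0308 -> 230
  | _ -> 0

(* Step 1: decompose recursively until nothing decomposes further. *)
let rec decompose cp =
  match decomposition cp with
  | None -> [cp]
  | Some parts -> List.concat (List.map decompose parts)

(* Step 2: stable-sort each run of combining marks by combining class.
   [run] and [acc] are both kept in reverse order. *)
let reorder cps =
  let by_class a b = compare (combining_class a) (combining_class b) in
  let flush run acc =
    List.rev_append (List.stable_sort by_class (List.rev run)) acc in
  let rec go run acc = function
    | [] -> List.rev (flush run acc)
    | cp :: rest when combining_class cp = 0 ->
        go [] (cp :: flush run acc) rest
    | cp :: rest -> go (cp :: run) acc rest
  in
  go [] [] cps

let nfd cps = reorder (List.concat (List.map decompose cps))
```

With these tables, nfd [0x1EAD] and nfd [0x0061; 0x0302; 0x0323] both give [0x0061; 0x0323; 0x0302], so the precomposed and the differently ordered spellings end up with one representation.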
  • 39. Camomile strings - UTF8, UTF16, UCS4
  • 40. Camomile strings - UTF8, UTF16, UCS4 UTF8 UTF-8 string as a string
  • 41. Camomile strings - UTF8, UTF16, UCS4 UTF8 UTF-8 string as a string UTF16 UTF-16 string as an unsigned 16-bit integer bigarray
  • 42. Camomile strings - UTF8, UTF16, UCS4 UTF8 UTF-8 string as a string UTF16 UTF-16 string as an unsigned 16-bit integer bigarray UCS4 UTF-32 string as a 32-bit integer bigarray
  • 43. Camomile strings - UTF8, UTF16, UCS4 UTF8 UTF-8 string as a string UTF16 UTF-16 string as an unsigned 16-bit integer bigarray UCS4 UTF-32 string as a 32-bit integer bigarray UnicodeString.Type UTF-8/16 and UCS4 all conform to UnicodeString.Type String operations are functors over UnicodeString.Type
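
The idea of writing string operations once, as functors over a common string signature, can be sketched as follows. The signature TEXT below is a deliberately tiny stand-in invented for this example, not Camomile's actual UnicodeString.Type, and the module names are hypothetical. The UTF-8 decoder relies on String.get_utf_8_uchar, so OCaml 4.14 or later is assumed.

```ocaml
(* Hypothetical minimal signature: anything that can be folded over
   as a sequence of Unicode characters. *)
module type TEXT = sig
  type t
  val fold : ('a -> Uchar.t -> 'a) -> 'a -> t -> 'a
end

(* UTF-8 text stored in an ordinary OCaml string. *)
module Utf8_text : TEXT with type t = string = struct
  type t = string
  let fold f init s =
    let rec go acc i =
      if i >= String.length s then acc
      else
        let d = String.get_utf_8_uchar s i in
        go (f acc (Uchar.utf_decode_uchar d)) (i + Uchar.utf_decode_length d)
    in
    go init 0
end

(* UCS4-like text: one array cell per character. *)
module Ucs4_text : TEXT with type t = Uchar.t array = struct
  type t = Uchar.t array
  let fold = Array.fold_left
end

(* An operation written once and instantiated per representation. *)
module Count (T : TEXT) = struct
  let length text = T.fold (fun n _ -> n + 1) 0 text
  let non_ascii text =
    T.fold (fun n u -> if Uchar.to_int u > 0x7F then n + 1 else n) 0 text
end

module Count_utf8 = Count (Utf8_text)
module Count_ucs4 = Count (Ucs4_text)
```

Count_utf8.length "a\xC3\xA4" counts two characters even though the string holds three bytes, and Count_ucs4 gives the same answer for the equivalent array without any change to the counting code.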
  • 44. Camomile modules - UNF Module for Unicode normal form module type Type = sig type text type index val nfd : text -> text val nfkd : text -> text val nfc : text -> text val nfkc : text -> text val canon_compare : text -> text -> int end module Make (Text : UnicodeString.Type) : Type with type text = Text.t and type index = Text.index
  • 45. Camomile modules - UNF Create a module for a given Unicode string module type Type = sig type text type index val nfd : text -> text val nfkd : text -> text val nfc : text -> text val nfkc : text -> text val canon_compare : text -> text -> int end module Make (Text : UnicodeString.Type) : Type with type text = Text.t and type index = Text.index
  • 46. Camomile modules - UNF Conversion to NFD module type Type = sig type text type index val nfd : text -> text val nfkd : text -> text val nfc : text -> text val nfkc : text -> text val canon_compare : text -> text -> int end module Make (Text : UnicodeString.Type) : Type with type text = Text.t and type index = Text.index
  • 47. Camomile modules - UNF Compare strings by semantic equivalence module type Type = sig type text type index val nfd : text -> text val nfkd : text -> text val nfc : text -> text val nfkc : text -> text val canon_compare : text -> text -> int end module Make (Text : UnicodeString.Type) : Type with type text = Text.t and type index = Text.index
  • 48. Camomile modules - UNF By lazily building the NFDs and comparing them module type Type = sig type text type index val nfd : text -> text val nfkd : text -> text val nfc : text -> text val nfkc : text -> text val canon_compare : text -> text -> int end module Make (Text : UnicodeString.Type) : Type with type text = Text.t and type index = Text.index
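
Comparing two strings by lazily building their NFDs can be sketched with the standard Seq module. This is only an illustration of the technique, not Camomile's canon_compare: the tables are the same toy ones as in the NFD example, code points are plain ints, the ordering is simply code point order of the NFD, and Seq.append requires OCaml 4.11 or later.

```ocaml
(* Toy tables (assumption: only the code points from the slides). *)
let decomposition = function
  | 0x00E2 -> Some [0x0061; 0x0302]
  | 0x1EA1 -> Some [0x0061; 0x0323]
  | 0x1EAD -> Some [0x1EA1; 0x0302]
  | _ -> None

let combining_class = function
  | 0x0323 -> 220
  | 0x0302 -> 230
  | _ -> 0

let rec decompose cp =
  match decomposition cp with
  | None -> [cp]
  | Some parts -> List.concat (List.map decompose parts)

(* Lazily reorder: only the current run of combining marks is forced. *)
let rec reorder (s : int Seq.t) : int Seq.t = fun () ->
  match s () with
  | Seq.Nil -> Seq.Nil
  | Seq.Cons (cp, rest) when combining_class cp = 0 ->
      Seq.Cons (cp, reorder rest)
  | Seq.Cons (cp, rest) ->
      let rec take acc s =
        match s () with
        | Seq.Cons (m, tl) when combining_class m <> 0 -> take (m :: acc) tl
        | _ -> (List.rev acc, s)
      in
      let marks, rest' = take [cp] rest in
      let by_class a b = compare (combining_class a) (combining_class b) in
      Seq.append (List.to_seq (List.stable_sort by_class marks))
        (reorder rest') ()

(* The NFD of the input, produced on demand. *)
let nfd_seq cps =
  reorder (Seq.flat_map (fun cp -> List.to_seq (decompose cp)) (List.to_seq cps))

(* Walk both lazy sequences in step and stop at the first difference. *)
let rec compare_seq a b =
  match a (), b () with
  | Seq.Nil, Seq.Nil -> 0
  | Seq.Nil, _ -> -1
  | _, Seq.Nil -> 1
  | Seq.Cons (x, a'), Seq.Cons (y, b') ->
      if x <> y then compare x y else compare_seq a' b'

let canon_compare_sketch a b = compare_seq (nfd_seq a) (nfd_seq b)
```

Because both sequences are consumed on demand, two long strings that differ near the start are never normalized in full. canon_compare_sketch [0x1EAD] [0x0061; 0x0302; 0x0323] evaluates to 0.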
  • 49. Outline Overview ASCII to Unicode : A challenge of multilingualization Example : Unicode normal forms ulib Conclusion
  • 50. ulib - yet another Unicode library Now under development
  • 51. ulib - yet another Unicode library ulib is compact
  • 52. ulib - yet another Unicode library ulib is compact Minimum functionalities
  • 53. ulib - yet another Unicode library ulib is compact Minimum functionalities No data file
  • 54. ulib - yet another Unicode library ulib is compact Minimum functionalities No data file No initialization
  • 55. ulib - yet another Unicode library ulib is modern
  • 56. ulib - yet another Unicode library ulib is modern Rope for Unicode string
  • 57. ulib - yet another Unicode library ulib is modern Rope for Unicode string Zipper for indexing rope
  • 58. ulib - yet another Unicode library ulib is modern Rope for Unicode string Zipper for indexing rope Pluggable code converter using first class modules
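
What "pluggable code converter using first class modules" could look like is sketched below. None of these names come from ulib: the CODEC signature, the registry and the functions are all hypothetical, and only the general technique (packing modules as values and looking them up at run time) is taken from the slide.

```ocaml
(* Hypothetical converter interface. *)
module type CODEC = sig
  val name : string
  val decode : string -> Uchar.t list
end

(* ISO-8859-1 maps each byte directly to the code point of equal value. *)
let latin1 : (module CODEC) =
  (module struct
    let name = "ISO-8859-1"
    let decode s =
      List.init (String.length s) (fun i -> Uchar.of_int (Char.code s.[i]))
  end)

(* Converters are ordinary values, so they can live in a table and be
   added at run time. *)
let registry : (string, (module CODEC)) Hashtbl.t = Hashtbl.create 8

let register (c : (module CODEC)) =
  let module C = (val c) in
  Hashtbl.replace registry C.name c

let decode_with name s =
  match Hashtbl.find_opt registry name with
  | None -> None
  | Some c ->
      let module C = (val c : CODEC) in
      Some (C.decode s)

let () = register latin1
```

After registration, decode_with "ISO-8859-1" "\xE4" yields the single character U+00E4, and an unknown encoding name yields None instead of raising.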
  • 59. Outline Overview ASCII to Unicode : A challenge of multilingualization Example : Unicode normal forms ulib Conclusion
  • 61. Conclusion Unicode is different from ASCII
  • 62. Conclusion Unicode is different from ASCII Camomile addresses a "logical" part of Unicode
  • 63. Conclusion Unicode is different from ASCII Camomile addresses a "logical" part of Unicode Functors and laziness play crucial roles
  • 64. Conclusion Unicode is different from ASCII Camomile addresses a "logical" part of Unicode Functors and laziness play crucial roles A simpler library, "ulib", is now under development.
  • 65. Project URL Camomile https://github.com/yoriyuki/Camomile ulib https://github.com/yoriyuki/ulib