SlideShare a Scribd company logo
1 of 24
Download to read offline
2014年4月6日
CodeIQ
「クリプタン帝国の暗号文を解読しよう!」
問1の解答例
前原 貴憲 (@tmaehara)
問題
何らかの文章(アスキーコード)を換字式暗号で暗号化した
列s = s1 . . . sn が与えられる.復号化せよ.
換字式暗号:
before after
a p
b c
c z
...
...
z m
before: s = “abababc”
⇓
after: T(s) =“pcpcpcz”
対応表T に従って一文字ずつ変換する(T は非公開)
1/ 23
解答例
2/ 23
方針:頻度攻撃
自然言語の文書には「文字頻度の偏り」がある
⇒ 頻度攻撃:暗号文中の文字を数えて対応関係を探す
(例:一番多い文字は e,次は t, ...)
(http://en.wikipedia.org/wiki/Letter_frequency)
3/ 23
解答例
1. 1文字の出現頻度からスペースと e を特定
2. 2文字組の出現頻度から th と in を特定
3. with っぽい部分から w を特定
4. …
5. ある程度文章らしくなったら Google に投げる
このような try-and-error で10分そこらで復号化可能
答え = Sharlock Holmesの一節
データだけからは厳密には判別困難なものもあるので
最後は文脈や出題者の好みなどを考慮する必要あり
(例: ” or ’ / Watson or Hatson or Datson)
4/ 23
ここから本題
「頻度攻撃」をもっと「高級」に
5/ 23
尤度攻撃
言語中に文字列tが出現する確率P (t)を考える
このとき,換字表T のもっともらしさ(尤度)は P (T (s))
尤度攻撃:尤度最大化問題
maximize P (T (s))
を解くことで,復号化する手法
Q1. 文字列の出現確率P (t)はどう計算(推定)する?
Q2. どうやって尤度最適化問題を解く?
6/ 23
出現確率P (t)の推定
7/ 23
言語モデル
本当の文字列出現確率P (t)はもちろん未知
⇒ 適当にモデル化したもの:(確率的)言語モデル
(このモデル化は,自然言語処理の最も基本的な問題の1つ)
典型的なモデル:t = t1t2 · · · tn として
• 1-gram モデル:P (t) = p(t1) · p(tn)
(各文字が独立にランダム出現する)
• 2-gram モデル:P (t) = p(t1)p(t2|t1) · · · p(tn|tn−1)
(前の文字に依存してランダム出現する)
• その他,さまざまなモデルが存在
今回は2-gramモデルを加法的スムージングしたものを使う
8/ 23
2-gramモデル
「いまの文字に依存して次の文字の確率が決まる」モデル
p(a|b):文字bの次に文字aが出現する確率として
P (t) = p(t1| )p(t2|t1) · · · p(tn|tn−1)
モデルの推定 = p(a|b)を決めること
推定法
• 適当なテキスト(コーパスと呼ばれる)を用意,
•「文字bの次に文字aが現れた回数f(a|b)」を数える
• 適当に正規化して確率にする:
p(a|b) =
f(a|b)
∑
c f(a|c)
9/ 23
2-gramモデル+加法的スムージング
2-gramモデルの問題点:
低頻度文字に対するp(a|b)の推定精度が低い
例:手持ちのコーパスに出なかった文字の組は
  将来的にもずっと出現しないとしてよいか?
加法的スムージング:すべての頻度に適当な定数kを加える
˜f(a|b) = f(a|b) + k
• 頻度が高いものは,生の頻度に近い
• 頻度が低いものは,一様分布に近づく
他のモデルと比べて精度は低いが,実装がとても簡単
10/ 23
尤度最大化問題の解き方
11/ 23
尤度最大化問題の解き方
maximize P (T (s))
• 言語モデルP (t)は計算できるようになった
• 尤度P (T (s))をどうやって最大化する?
•「一番良い置換を求めるタイプの問題」の解法
(基本的に NP-hard になるので厳密解法はあきらめる)
– 焼きなまし Simulated Annealing
– タブーサーチ Tabu Search
– 遺伝的アルゴリズム Genetic Algorithm
– ...
ここではタブーサーチを紹介(他の方法でも解けます)
12/ 23
タブーサーチ
1. 適当な初期解T を構成
2. 満足するまで以下を実行
2.1. T から新しい解T1, . . . , Tw を生成
ただし過去h回以内に訪問した解は生成しない
2.2. T1, . . . , Tw のうち最も良い解に移動
3. 訪問した中で最も良いものを出力
• 局所探索法の一種
• 過去h個の記憶(タブーリスト)をもつ
• 目的関数が悪化しても,タブーリストに入っていなければ
移動し,局所最適解に落ちるのを防止する
比較的良い解が出れば十分な場合,特に強い
13/ 23
数値実験
14/ 23
実験
言語モデルのためのコーパスは Project Gutenberg から
• The Adventures of Sharlock Holmes, by Arthur Conan Doyle
• Alice’s Adventures in Wonderland, by Lewis Carroll
• Pride and Prejudice, by Jane Austin
• The Old Testament of the King James Version of the Bible
をダウンロードして使用,違いを観察する
タブーサーチの初期解は頻度から貪欲法で生成
反復数は,解を見ながら適当に打ち切る(1000回程度)
近傍は2要素の交換で生成,近傍サイズw = 100程度
履歴サイズh = 10 程度使用
(計算時間はどれも数十秒∼数分程度)
次頁より,結果を真の解と比較(赤字は復元ミス)
15/ 23
by Adventure of Sherlock Holmes
”You see, my dear Watson,” II he propped his testItube in
the rack, and began to lecture with the air of a professor
addressing his class II ”it is not really difficult to construct
a series of inferences, each dependent upon its predecessor
and each simple in itself. Of, after doing so, one simply
knocks out all the central inferences and presents one’s
audience with the startingIpoint and the conclusion, one
may produce a startling, though possibly a meretricious,
effect. How, it was not really difficult, by an inspection of
the groove between your left forefinger and thumb, to feel
sure that you did HEA propose to invest your small capital
in the gold fields.”
16/ 23
by Alice’s Adventure in Wonderland
’You see, my dear Hatson,’ II he propped his testItube in
the rack, and began to lecture with the air of a professor
addressing his class II ’it is not really difficult to construct a
series of inferences, each dependent upon its predecessor
and each simple in itself. Af, after doing so, one simply
knocks out all the central inferences and presents one-s
audience with the startingIpoint and the conclusion, one
may produce a startling, though possibly a meretricious,
effect. Mow, it was not really difficult, by an inspection of
the groove between your left forefinger and thumb, to feel
sure that you did MER propose to invest your small capital
in the gold fields.’
17/ 23
by Pride and Prejudice
”You see, my dear Datson,” he propped his test tube in
the rack, and began to lecture with the air of a professor
addressing his class ”it is not really difficult to construct
a series of inferences, each dependent upon its predecessor
and each simple in itself. If, after doing so, one simply
knocks out all the central inferences and presents one’s
audience with the starting point and the conclusion, one
may produce a startling, though possibly a meretricious,
effect. Cow, it was not really difficult, by an inspection of
the groove between your left forefinger and thumb, to feel
sure that you did CO; propose to invest your small capital
in the gold fields.”
18/ 23
by The Bible
)Gou see, my dear Satson,) 22 he propped his test2tube in
the rack, and began to lecture with the air of a professor
addressing his class 22 )it is not really difficult to construct
a series of inferences, each dependent upon its predecessor
and each simple in itself; If, after doing so, one simply
knocks out all the central inferences and presents one’s
audience with the starting2point and the conclusion, one
may produce a startling, though possibly a meretricious,
effect; Low, it was not really difficult, by an inspection of
the groove between your left forefinger and thumb, to feel
sure that you did LOD propose to invest your small capital
in the gold fields;)
19/ 23
by The Bible; 青地は原文に未出現単語
)Gou see, my dear Satson,) 22 he propped his test2tube in
the rack, and began to lecture with the air of a professor
addressing his class 22 )it is not really difficult to construct
a series of inferences, each dependent upon its predecessor
and each simple in itself; If, after doing so, one simply
knocks out all the central inferences and presents one’s
audience with the starting2point and the conclusion, one
may produce a startling, though possibly a meretricious,
effect; Low, it was not really difficult, by an inspection of
the groove between your left forefinger and thumb, to feel
sure that you did LOD propose to invest your small capital
in the gold fields;)
20/ 23
まとめ
換字式暗号に対する尤度攻撃
• 言語モデルを設定(2-gram + 加法的スムージング)
• 尤度を最適化(タブーサーチ)
実験結果
• 人が目で見て読める程度には復元可能
(原文に含まれない単語もなんとかなる)
• 固有名詞や記号などは厳しい
(文脈や出題者の好みを考える必要あり)
実験に用いたコード
https://gist.github.com/spaghetti-source/10002001
21/ 23
参考文献1:言語モデル
• 北 研二, 辻井 潤一 (1999): 計算と言語(4)確率的言語モデル, 東大
出版
• http://plata.ar.media.kyoto-u.ac.jp/mori/research/
topics/LM/LM.pdf
• http:
//chasen.org/~daiti-m/paper/naist-dlec2004-lmodel.pdf
22/ 23
参考文献2:尤度最大化
• W. S. Forsyth and R. Safavi-Naini (1993):
Automated cryptanalysis of substitution ciphers. Cryptologia, 17,
4:407-418. 焼き鈍しで換字式暗号を解く
• R. Spillman, M. Janssen, B. Nelson, and M. Kepner (1993):
Use of a genetic algorithm in the cryptanalysis of simple
substitution ciphers. Cryptologia, 17: 1:31-44. 遺伝的アルゴリズム
で解く
• A. K. Verma, M. Dave, and R. C. Joshi (2007):
Genetic algorithm and babu search attack on the mono-alphabetic
subsitution cipher in adhoc networks. タブーサーチで解く
• P. Diaconis (2009): The markov chain monte carlo revolution.
Bulletin of the American Mathematical Society, 46, 2:179-205.
MCMCで解く(≒焼きなまし)
23/ 23

More Related Content

Featured

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by HubspotMarius Sescu
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTExpeed Software
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsPixeldarts
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthThinkNow
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfmarketingartwork
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024Neil Kimberley
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 

Featured (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

クリプタン帝国の暗号文を解読しよう(問1)

  • 2. 問題 何らかの文章(アスキーコード)を換字式暗号で暗号化した 列s = s1 . . . sn が与えられる.復号化せよ. 換字式暗号: before after a p b c c z ... ... z m before: s = “abababc” ⇓ after: T(s) =“pcpcpcz” 対応表T に従って一文字ずつ変換する(T は非公開) 1/ 23
  • 5. 解答例 1. 1文字の出現頻度からスペースと e を特定 2. 2文字組の出現頻度から th と in を特定 3. with っぽい部分から w を特定 4. … 5. ある程度文章らしくなったら Google に投げる このような try-and-error で10分そこらで復号化可能 答え = Sharlock Holmesの一節 データだけからは厳密には判別困難なものもあるので 最後は文脈や出題者の好みなどを考慮する必要あり (例: ” or ’ / Watson or Hatson or Datson) 4/ 23
  • 7. 尤度攻撃 言語中に文字列tが出現する確率P (t)を考える このとき,換字表T のもっともらしさ(尤度)は P (T (s)) 尤度攻撃:尤度最大化問題 maximize P (T (s)) を解くことで,復号化する手法 Q1. 文字列の出現確率P (t)はどう計算(推定)する? Q2. どうやって尤度最適化問題を解く? 6/ 23
  • 9. 言語モデル 本当の文字列出現確率P (t)はもちろん未知 ⇒ 適当にモデル化したもの:(確率的)言語モデル (このモデル化は,自然言語処理の最も基本的な問題の1つ) 典型的なモデル:t = t1t2 · · · tn として • 1-gram モデル:P (t) = p(t1) · p(tn) (各文字が独立にランダム出現する) • 2-gram モデル:P (t) = p(t1)p(t2|t1) · · · p(tn|tn−1) (前の文字に依存してランダム出現する) • その他,さまざまなモデルが存在 今回は2-gramモデルを加法的スムージングしたものを使う 8/ 23
  • 10. 2-gramモデル 「いまの文字に依存して次の文字の確率が決まる」モデル p(a|b):文字bの次に文字aが出現する確率として P (t) = p(t1| )p(t2|t1) · · · p(tn|tn−1) モデルの推定 = p(a|b)を決めること 推定法 • 適当なテキスト(コーパスと呼ばれる)を用意, •「文字bの次に文字aが現れた回数f(a|b)」を数える • 適当に正規化して確率にする: p(a|b) = f(a|b) ∑ c f(a|c) 9/ 23
  • 13. 尤度最大化問題の解き方 maximize P (T (s)) • 言語モデルP (t)は計算できるようになった • 尤度P (T (s))をどうやって最大化する? •「一番良い置換を求めるタイプの問題」の解法 (基本的に NP-hard になるので厳密解法はあきらめる) – 焼きなまし Simulated Annealing – タブーサーチ Tabu Search – 遺伝的アルゴリズム Genetic Algorithm – ... ここではタブーサーチを紹介(他の方法でも解けます) 12/ 23
  • 14. タブーサーチ 1. 適当な初期解T を構成 2. 満足するまで以下を実行 2.1. T から新しい解T1, . . . , Tw を生成 ただし過去h回以内に訪問した解は生成しない 2.2. T1, . . . , Tw のうち最も良い解に移動 3. 訪問した中で最も良いものを出力 • 局所探索法の一種 • 過去h個の記憶(タブーリスト)をもつ • 目的関数が悪化しても,タブーリストに入っていなければ 移動し,局所最適解に落ちるのを防止する 比較的良い解が出れば十分な場合,特に強い 13/ 23
  • 16. 実験 言語モデルのためのコーパスは Project Gutenberg から • The Adventures of Sharlock Holmes, by Arthur Conan Doyle • Alice’s Adventures in Wonderland, by Lewis Carroll • Pride and Prejudice, by Jane Austin • The Old Testament of the King James Version of the Bible をダウンロードして使用,違いを観察する タブーサーチの初期解は頻度から貪欲法で生成 反復数は,解を見ながら適当に打ち切る(1000回程度) 近傍は2要素の交換で生成,近傍サイズw = 100程度 履歴サイズh = 10 程度使用 (計算時間はどれも数十秒∼数分程度) 次頁より,結果を真の解と比較(赤字は復元ミス) 15/ 23
  • 17. by Adventure of Sherlock Holmes ”You see, my dear Watson,” II he propped his testItube in the rack, and began to lecture with the air of a professor addressing his class II ”it is not really difficult to construct a series of inferences, each dependent upon its predecessor and each simple in itself. Of, after doing so, one simply knocks out all the central inferences and presents one’s audience with the startingIpoint and the conclusion, one may produce a startling, though possibly a meretricious, effect. How, it was not really difficult, by an inspection of the groove between your left forefinger and thumb, to feel sure that you did HEA propose to invest your small capital in the gold fields.” 16/ 23
  • 18. by Alice’s Adventure in Wonderland ’You see, my dear Hatson,’ II he propped his testItube in the rack, and began to lecture with the air of a professor addressing his class II ’it is not really difficult to construct a series of inferences, each dependent upon its predecessor and each simple in itself. Af, after doing so, one simply knocks out all the central inferences and presents one-s audience with the startingIpoint and the conclusion, one may produce a startling, though possibly a meretricious, effect. Mow, it was not really difficult, by an inspection of the groove between your left forefinger and thumb, to feel sure that you did MER propose to invest your small capital in the gold fields.’ 17/ 23
  • 19. by Pride and Prejudice ”You see, my dear Datson,” he propped his test tube in the rack, and began to lecture with the air of a professor addressing his class ”it is not really difficult to construct a series of inferences, each dependent upon its predecessor and each simple in itself. If, after doing so, one simply knocks out all the central inferences and presents one’s audience with the starting point and the conclusion, one may produce a startling, though possibly a meretricious, effect. Cow, it was not really difficult, by an inspection of the groove between your left forefinger and thumb, to feel sure that you did CO; propose to invest your small capital in the gold fields.” 18/ 23
  • 20. by The Bible )Gou see, my dear Satson,) 22 he propped his test2tube in the rack, and began to lecture with the air of a professor addressing his class 22 )it is not really difficult to construct a series of inferences, each dependent upon its predecessor and each simple in itself; If, after doing so, one simply knocks out all the central inferences and presents one’s audience with the starting2point and the conclusion, one may produce a startling, though possibly a meretricious, effect; Low, it was not really difficult, by an inspection of the groove between your left forefinger and thumb, to feel sure that you did LOD propose to invest your small capital in the gold fields;) 19/ 23
  • 21. by The Bible; 青地は原文に未出現単語 )Gou see, my dear Satson,) 22 he propped his test2tube in the rack, and began to lecture with the air of a professor addressing his class 22 )it is not really difficult to construct a series of inferences, each dependent upon its predecessor and each simple in itself; If, after doing so, one simply knocks out all the central inferences and presents one’s audience with the starting2point and the conclusion, one may produce a startling, though possibly a meretricious, effect; Low, it was not really difficult, by an inspection of the groove between your left forefinger and thumb, to feel sure that you did LOD propose to invest your small capital in the gold fields;) 20/ 23
  • 22. まとめ 換字式暗号に対する尤度攻撃 • 言語モデルを設定(2-gram + 加法的スムージング) • 尤度を最適化(タブーサーチ) 実験結果 • 人が目で見て読める程度には復元可能 (原文に含まれない単語もなんとかなる) • 固有名詞や記号などは厳しい (文脈や出題者の好みを考える必要あり) 実験に用いたコード https://gist.github.com/spaghetti-source/10002001 21/ 23
  • 23. 参考文献1:言語モデル • 北 研二, 辻井 潤一 (1999): 計算と言語(4)確率的言語モデル, 東大 出版 • http://plata.ar.media.kyoto-u.ac.jp/mori/research/ topics/LM/LM.pdf • http: //chasen.org/~daiti-m/paper/naist-dlec2004-lmodel.pdf 22/ 23
  • 24. 参考文献2:尤度最大化 • W. S. Forsyth and R. Safavi-Naini (1993): Automated cryptanalysis of substitution ciphers. Cryptologia, 17, 4:407-418. 焼き鈍しで換字式暗号を解く • R. Spillman, M. Janssen, B. Nelson, and M. Kepner (1993): Use of a genetic algorithm in the cryptanalysis of simple substitution ciphers. Cryptologia, 17: 1:31-44. 遺伝的アルゴリズム で解く • A. K. Verma, M. Dave, and R. C. Joshi (2007): Genetic algorithm and babu search attack on the mono-alphabetic subsitution cipher in adhoc networks. タブーサーチで解く • P. Diaconis (2009): The markov chain monte carlo revolution. Bulletin of the American Mathematical Society, 46, 2:179-205. MCMCで解く(≒焼きなまし) 23/ 23