
[CB19] Shattering the dark: uncovering vulnerabilities of the dark web by Takahiro Yoshimura, Ken-ya Yoshimura


The dark web is an anonymized information space: it conceals both you and your visitors, so you cannot distinguish legitimate visitors from malicious attackers. Many hidden services employ CAPTCHAs hoping to fend off attackers, but is that enough? Given the current state of Web application vulnerability scanners, it is tempting to say that CAPTCHAs are at least marginally sufficient to ward off automated fuzzers and scanners. We don't think so. We have created a free (as in freedom), semi-automatic Web vulnerability scanner named Shatter. It lets pentesters describe targets, analyses, and data flows with code, so it can carry out comprehensive scans in an automatic and repeatable manner. It not only detects issues but also aims to actively exploit them to take the service over, optionally with external tools such as sqlmap, Metasploit, or custom exploitation scripts, making it a comprehensive and fierce Web application penetrator. In our session we will use it to breach a certain hidden service requiring CAPTCHAs, exposing some actual vulnerabilities which may lead to identity breaches.


  1. SHATTERING THE DARK: CODE BLUE 2019, BLUEBOX 4
  2. WHO WE ARE ▸ Ken-ya YOSHIMURA (@ad3liae) ▸ Takahiro YOSHIMURA (@alterakey) ▸ Security researchers ▸ Monolith Works Inc. CEO/CTO https://moonlithworks.co.jp/
  3. WHAT WE DO ▸ Security research and development ▸ iOS/Android apps → Financial, games, IoT-related, etc. (>200) → trueseeing: non-decompiling Android application vulnerability scanner [2017] ▸ Windows/Mac/Web/HTML5 apps → POS, RAD tools, etc. ▸ Network/Web penetration testing → PCI-DSS etc. ▸ Search engine reconnaissance (a.k.a. Google Hacking) ▸ Whitebox testing ▸ Forensic analysis ▸ Research → Clairvoyance: concurrent lip reader [2019]
  4. WHAT WE DO ▸ CTF ▸ Enemy10, Sutegoma2 ▸ METI CTFCJ 2012 Qual.: 1st ▸ METI CTFCJ 2012: 3rd ▸ DEF CON 21 CTF: 6th ▸ DEF CON 22 OpenCTF: 4th ▸ Talks: DEF CON 25 Demo Labs, CODE BLUE 2017, DEF CON 27 AI Village, etc. (Image: DEFCON 2016 by Wiyre Media on flickr, CC-BY 2.0)
  5. RELATED WORKS ▸ Web application vulnerability scanners ▸ Manual: Burp Suite, ZAP, etc. ▸ Automatic: WebInspect etc.
  6. WHAT IS THE DARK WEB? ▸ Anonymized Web on (mostly) Tor ▸ Pure freedom and anarchism ▸ Hard-ish to identify users → CAPTCHAs are often deployed ▸ Traffic routes are randomized → Rather high TTLs (Image: Onions by Mike Mozart on flickr, CC-BY 2.0)
  7. JOKER’S STASH: CASE STUDY #1
  8. JOKER’S STASH ▸ Fake credit card market?
  9. PREPARATION - TRADITIONAL ▸ Manual ▸ Crawl and build data flows: tedious, error-prone, and not repeatable ▸ Automatic ▸ Spider: not so comprehensive; insufficient coverage
  10. SHATTER: THE IN-BETWEEN BEAUTY ▸ Our answer: Shatter ▸ Semi-automatic ▸ Repeatable ▸ Comprehensive (Image: Shattering by chiaralily on flickr, CC-BY-NC 2.0)
  11. PREPARATION - SHATTER ▸ Manually crawl, mark, and map → “Target maps” ▸ Edit target maps and go ▸ Target maps describe scans ▸ Marked requests are recognized as “targets” ▸ Data flows are mostly deduced automatically, hence semi-automatic ▸ The same map gives the same scan, hence repeatable (Image: Planning by Jeremy Keith on flickr, CC-BY 2.0)
  12. SHATTER TARGET MAP ▸ Terse and readable YAML ▸ Composed of: ▸ Analyses: what we should do ▸ Sessions: how we should do it ▸ Identities: who we should be ▸ Targets: whom we approach ▸ Flows: how we deduce parameters (optional) ▸ Exploits: what we should do on findings
  13. ATTACK PLAN / EXECUTE ▸ Data flow map ▸ Flows are wholly deduced ▸ Massively parallel scan → combats high TTLs ▸ Scanner is ZAP-compatible (for now)
  14. DEMO 1: AUTOMATIC EXPLOITATION ATTEMPTS
  15. AFTERMATH ▸ Insanely old middleware → Automatic exploitation attempt returned HTTP 500 ▸ Operator identity: “Evgenij Sokolov”, “Bertrand Rasse”, possibly others; omerta.sup@gmail.com ▸ Operator works: http://omerta.wf/ etc. ▸ cf. omerta (n) 1: a code of silence practiced by the Mafia; a refusal to give evidence to the police about criminal activities
  16. THE NIGHTMARE: CASE STUDY #2
  17. NIGHTMARE ▸ Black market ▸ Successor to Dream Market?
  18. PREPARATION - TRADITIONAL ▸ CAPTCHA ▸ Potential showstopper
  19. PREPARATION - SHATTER ▸ CAPTCHA ▸ Parameters can be deduced with code blocks → NN-based solvers can be attached!
  20. CAPTCHA 102 ▸ Recognizing glyphs in an image ▸ Hard to solve algorithmically ▸ 3-dimensional distortion ▸ Noise
  21. LEARN TO RECOGNIZE ▸ Image classification problem ▸ CNN (Convolutional Neural Network) ▸ Supervised learning model ▸ Similar to the visual cortex ▸ Good at spatial pattern recognition ▸ Robust against distortions and shifts (Image: Typical CNN architecture by Aphex34 on Wikipedia, CC-BY-SA 4.0)
  22. LEARN TO RECOGNIZE ▸ For 5 chars: (10+26)^5 → more than 10^7 patterns ▸ Cannot be solved at once ▸ Just classifiers (Image: Typical CNN architecture by Aphex34 on Wikipedia, CC-BY-SA 4.0)
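The size of the problem on this slide can be checked directly. The sketch below assumes a 5-character CAPTCHA over the 36-symbol charset [A-Z0-9] and compares the number of classes needed to classify a whole image at once with the number needed to classify one glyph at a time.

```python
import string

CHARSET = string.ascii_uppercase + string.digits  # 26 letters + 10 digits
LENGTH = 5  # characters per CAPTCHA, as in the slide's example

# Treating the whole image as one label needs a class per possible string.
whole_image_classes = len(CHARSET) ** LENGTH

# Treating each glyph separately needs only a class per character.
per_glyph_classes = len(CHARSET)

print(whole_image_classes)  # 60466176
print(per_glyph_classes)    # 36
```

A classifier with about 6 × 10^7 output classes is impractical to train, which is why the following slides split each image into glyphs first.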
  23. DIVIDE AND CONQUER ▸ OpenCV2 ▸ De-speckling ▸ Extracting glyphs ▸ Errors due to lack of spacing → ignored for now
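The slide does not say how glyphs are extracted. One simple possibility, shown here as a pure-Python stand-in and not as the talk's OpenCV pipeline, is column projection: cut the binarized image wherever a column contains no ink. The function names are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def column_ink(image: Image) -> List[int]:
    """Count ink pixels in every column."""
    return [sum(col) for col in zip(*image)]


def glyph_spans(image: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of runs that contain ink."""
    spans = []
    start = None
    for x, ink in enumerate(column_ink(image)):
        if ink and start is None:
            start = x                  # a glyph begins
        elif not ink and start is not None:
            spans.append((start, x))   # a blank column ends the glyph
            start = None
    if start is not None:
        spans.append((start, len(image[0])))
    return spans


def extract_glyphs(image: Image) -> List[Image]:
    """Crop one sub-image per span."""
    return [[row[a:b] for row in image] for a, b in glyph_spans(image)]


spaced = [
    [1, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
]
touching = [[1, 1, 1]]

print(glyph_spans(spaced))    # [(0, 1), (3, 5)]
print(glyph_spans(touching))  # [(0, 3)]
```

The second example shows the failure mode the slide mentions: glyphs with no blank column between them come out as a single span.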
  24. BREACH PLAN ▸ OpenCV2 ▸ Glyph extraction ▸ CNN ▸ Glyph classification (Image: Chess Teacher by JB Kilpatrick on flickr, CC-BY 2.0)
  25. BREACH PLAN? ▸ What should we learn from? ▸ Synthesized with generators (tag = parameters) ▸ Gathered truths (tag = pre-coordinated truths) (Image: Question by Florence Ivy on flickr, CC-BY-ND 2.0)
  26. HUMANS TO SAVE US ▸ Anti-Captcha ▸ CAPTCHA recognition service run by humans ▸ Gathered images and tags → Now we can learn ▸ Human-powered…? but: ▸ Tedious to recon generators ▸ Of course Shatter can use AC directly
  27. GRAB THEM OUT ▸ Let’s gather CAPTCHAs ▸ We need ~2000 ▸ High RTT! (2 s or more) (Image: Grab by Rutger Tuller on flickr, CC-BY 2.0)
  28. GRAB THEM OUT! ▸ asyncio super-parallel grabber → No mercy ▸ 2000 imgs / ~48 s (24 ms/img) ▸ Throughput is not so bad
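A minimal sketch of such a grabber, assuming a semaphore-bounded `asyncio.gather`. The talk's actual code is not shown in the slides. `grab_all`, `fake_fetch`, and the concurrency value are hypothetical, and the fetch is simulated with a short sleep so the example runs without any network access.

```python
import asyncio
from typing import Awaitable, Callable, List


async def grab_all(
    fetch: Callable[[int], Awaitable[bytes]],
    count: int,
    concurrency: int = 100,
) -> List[bytes]:
    """Run `count` fetches with at most `concurrency` in flight at once."""
    gate = asyncio.Semaphore(concurrency)

    async def one(index: int) -> bytes:
        async with gate:
            return await fetch(index)

    # gather() returns results in the order the coroutines were passed in.
    return await asyncio.gather(*(one(i) for i in range(count)))


async def fake_fetch(index: int) -> bytes:
    """Stand-in for an HTTP request over Tor; sleeps instead of using the network."""
    await asyncio.sleep(0.01)
    return f"captcha-{index}".encode()


images = asyncio.run(grab_all(fake_fetch, count=200, concurrency=50))
print(len(images))  # 200
```

Taking 2 s as the round trip, 2000 sequential fetches would need over an hour. Finishing in about 48 s therefore implies roughly 80 requests in flight on average, which is the effect the semaphore bound controls.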
  29. READ THEM OUT ▸ Read 2000 CAPTCHAs ▸ Out-of-charset reads ▸ Inaccurate glyph extracts ▸ Take only good reads!
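The "take only good reads" rule can be expressed as a small predicate. This is a sketch under stated assumptions: labels are expected to be exactly 5 characters from [A-Z0-9], and a sample is kept only when the number of extracted glyphs matches the label length. `is_good_read` is a hypothetical name.

```python
import string

CHARSET = set(string.ascii_uppercase + string.digits)
LENGTH = 5  # assumed CAPTCHA length


def is_good_read(label: str, glyph_count: int) -> bool:
    """Keep a sample only if the label is in-charset and matches the glyph count."""
    return (
        len(label) == LENGTH
        and set(label) <= CHARSET      # rejects out-of-charset reads
        and glyph_count == LENGTH      # rejects inaccurate glyph extraction
    )


print(is_good_read("A1B2C", 5))  # True
print(is_good_read("a1b2c", 5))  # False
print(is_good_read("A1B2C", 4))  # False
```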
  30. DIVIDE AND CONQUER ▸ OpenCV2 ▸ Shrink, despeckle, expand ▸ Glyph extraction
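The slide does not spell out the despeckling step. One common approach, shown here as an assumption and not as the talk's OpenCV pipeline, is to clear connected ink blobs that are too small to be part of a glyph. The pure-Python sketch below uses a flood fill; `despeckle` and `min_size` are hypothetical names.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def despeckle(image: Image, min_size: int = 3) -> Image:
    """Return a copy with 4-connected ink blobs smaller than min_size cleared."""
    height = len(image)
    width = len(image[0]) if image else 0
    out = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            # Flood-fill from this pixel to collect the whole blob.
            stack = [(y, x)]
            seen[y][x] = True
            blob = []
            while stack:
                cy, cx = stack.pop()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and image[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(blob) < min_size:
                for by, bx in blob:
                    out[by][bx] = 0  # too small to be a stroke: treat as noise
    return out


noisy = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(despeckle(noisy))
# [[1, 1, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 0, 0]]
```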
  31. DIVIDE AND CONQUER ▸ Samples: 6305 ▸ Should be around 10000… but ▸ Dropping glyph mis-extractions ▸ Dropping CAPTCHA mis-reads
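The "around 10000" figure follows from the earlier numbers, assuming each of the 2000 CAPTCHAs holds 5 glyphs. A short check of the expected count and of the share that survived filtering:

```python
CAPTCHAS_GRABBED = 2000
GLYPHS_PER_CAPTCHA = 5   # assumed from the 5-character example
SAMPLES_KEPT = 6305      # from the slide

expected = CAPTCHAS_GRABBED * GLYPHS_PER_CAPTCHA
yield_ratio = SAMPLES_KEPT / expected

print(expected)     # 10000
print(yield_ratio)  # 0.6305
```

Under that assumption, about 63% of the glyphs were usable after dropping mis-extractions and mis-reads.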
  32. RELENTLESS LEARNER ▸ CNN on Keras ▸ N×32×32×1 → 36 ([A-Z0-9]) ▸ Preprocessing ▸ Resizing and thresholding ▸ Normalization: [0.0 .. 1.0]
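A pure-Python sketch of the input and label encoding described on this slide. It is illustrative only: the resize to 32×32 is omitted, the cutoff of 128 is an assumption, and the class order (A-Z followed by 0-9) is a choice made here, not something the slides state.

```python
import string
from typing import List

CLASSES = string.ascii_uppercase + string.digits  # 36 labels, [A-Z0-9]


def threshold_and_normalize(pixels: List[List[int]], cutoff: int = 128) -> List[List[float]]:
    """Map 8-bit grayscale to 0.0 (below cutoff) or 1.0 (at or above cutoff)."""
    return [[1.0 if p >= cutoff else 0.0 for p in row] for row in pixels]


def one_hot(char: str) -> List[float]:
    """Encode one character as a 36-element target vector."""
    vec = [0.0] * len(CLASSES)
    vec[CLASSES.index(char)] = 1.0
    return vec


print(threshold_and_normalize([[0, 127, 128, 255]]))  # [[0.0, 0.0, 1.0, 1.0]]
print(one_hot("A")[:3])                               # [1.0, 0.0, 0.0]
```

Each glyph then becomes a 32×32×1 array of values in [0.0, 1.0], and each label a 36-way target, matching the N×32×32×1 → 36 shape on the slide.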
  33. RELENTLESS LEARNER ▸ Keeping learning effective ▸ Small input: 32×32×1 ▸ amsgrad (i.e. modified Adam) ▸ Test dataset ▸ 10% of original dataset ▸ Store the model in HDF5 format → for continuous learning
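The 10% test dataset can be carved out with a shuffled hold-out split. This is a generic sketch, not the talk's code. `split_holdout` and the fixed seed are hypothetical, and the sample count of 6305 is taken from the earlier slide.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_holdout(
    samples: Sequence[T], test_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle deterministically and return (train, held_out)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


train, held_out = split_holdout(range(6305))
print(len(train), len(held_out))  # 5675 630
```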
  34. LEARN TO BREAK ▸ 50 epochs → 30 min. TensorFlow 2.0 @ MBP 2017 ▸ GPU? ▸ Keras uses it automatically ▸ Only CUDA; MBP falls short :( (Image: Early Learner by Aaron Freimark on flickr, CC-BY-ND 2.0)
  35. LEARN TO BREAK! ▸ 99% acc. (even on other datasets) → Excellent ▸ Recognizes even what Anti-Captcha fails on ▸ CNN: should need 500..1000/class ▸ 175.1/class in reality ▸ Small dataset :( (Image: Early Learner by Aaron Freimark on flickr, CC-BY-ND 2.0)
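Two numbers on this slide can be worked through. The per-class figure follows from 6305 samples spread over 36 classes. The second calculation is an assumption-laden estimate: if the 99% is a per-glyph accuracy and glyph errors are independent, the chance of reading a whole 5-character CAPTCHA correctly is 0.99^5. The slides do not state which accuracy is meant.

```python
SAMPLES = 6305
CLASSES = 36

per_class = SAMPLES / CLASSES
print(round(per_class, 1))  # 175.1

PER_GLYPH_ACCURACY = 0.99   # assumed to be per glyph
LENGTH = 5                  # assumed CAPTCHA length

whole_captcha = PER_GLYPH_ACCURACY ** LENGTH
print(round(whole_captcha, 3))  # 0.951
```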
  36. CAPTCHA COMPROMISED ▸ Rarely misses on another dataset
  37. PREPARATION - SHATTER (2) ▸ Attach to the target map as a code block ▸ Feed the solver, return the result into the parameter
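The slides do not show Shatter's code-block interface, so the following is only a sketch of the general shape: a solver is a function from image bytes to a string, built by composing glyph extraction with per-glyph classification. Every name here (`make_solver`, `toy_extract`, `toy_classify`) is hypothetical, and the toy stand-ins exist only so the example runs without OpenCV or a trained model.

```python
from typing import Callable, List, Sequence

Glyph = Sequence[Sequence[int]]


def make_solver(
    extract: Callable[[bytes], List[Glyph]],
    classify: Callable[[Glyph], str],
) -> Callable[[bytes], str]:
    """Compose glyph extraction and per-glyph classification into one solver."""

    def solve(image: bytes) -> str:
        return "".join(classify(glyph) for glyph in extract(image))

    return solve


def toy_extract(image: bytes) -> List[Glyph]:
    """Toy stand-in: treat every byte as a 1x1 glyph."""
    return [[[byte]] for byte in image]


def toy_classify(glyph: Glyph) -> str:
    """Toy stand-in: read the character straight from the pixel value."""
    return chr(glyph[0][0])


solver = make_solver(toy_extract, toy_classify)
print(solver(b"A1B2C"))  # A1B2C
```

In a real setup the returned string would be what gets fed back into the CAPTCHA parameter of the request.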
  38. ATTACK PLAN / EXECUTE ▸ Data flow map ▸ CAPTCHAs are solved in real time
  39. DEMO 2: AUTOMATED SCAN, SOLVING MULTIPLE CAPTCHAS
  40. AFTERMATH (2) ▸ We have breached CAPTCHA protection for Nightmare (again) ▸ Their CAPTCHAs are rather weak (again) (Image: No lock 2 by Jens Eilers Bischoff on flickr, CC-BY 2.0)
  41. FREE AS FREEDOM ▸ http://sha.tter.io/ (GitHub repos will be announced there) ▸ AGPL-3: It remains free for good ▸ Currently under heavy work on fixes and .. ▸ We are striving to make it not only useful but also essential (Image: Freedom by Mochamad Arief on flickr, CC-BY-NC-ND 2.0)
  42. CONCLUSION ▸ The dark web ▸ Anonymized Web ▸ Hard to name attackers ▸ CAPTCHAs are often deployed but _not_ effective! ▸ Related works are not sufficient ▸ Automatic: non-comprehensive ▸ Manual: non-repeatable (Image: IMG_2988s by 不憂照相館 on flickr, CC-BY-NC-ND 2.0)
  43. CONCLUSION ▸ Our answer: Shatter ▸ Semi-automatic: crawl, mark, map, edit (you do); scan (we do) ▸ Repeatable: the same map gives the same scan ▸ Comprehensive: because you crawl ▸ Beauty lies in “semi-autonomy” (Image: Shattering by chiaralily on flickr, CC-BY-NC 2.0)
  44. CONCLUSION ▸ Shatter can… ▸ Deduce params automatically, or with some code (solving CAPTCHAs, 2FAs, …) ▸ Fingerprint and stage attacks ▸ Actively exploit vulnerabilities ▸ Cooperate with other toolchains for deeper analysis/exploitation (Image: Mise en scène nocturne by Jean-François Renaud on flickr, CC-BY-ND 2.0)
  45. CONCLUSION ▸ Shatter is ▸ At: http://sha.tter.io/ (GitHub repos will be announced there) ▸ Under AGPL-3: free as in freedom, for good ▸ Stay tuned! ▸ Under heavy work on fixes and .. ▸ Should be available on 12/24/2019 (Image: Freedom by Mochamad Arief on flickr, CC-BY-NC-ND 2.0)
  46. CONCLUSION ▸ For hidden service operators: ▸ CAPTCHAs are not effective ▸ Better update your stack ▸ If you do bad things, you must be prepared to be exposed (Image: Menace by Kilworth Simmonds on flickr, CC-BY-ND 2.0)
  47. FIN. 28.10.2019 MONOLITH WORKS INC.
