
red neuronal Som Net


Published in: Education

A SELF-ORGANIZING NEURAL SYSTEM FOR LEARNING TO RECOGNIZE TEXTURED SCENES

Stephen Grossberg¹ and James R. Williamson²
Department of Cognitive and Neural Systems and Center for Adaptive Systems
Boston University

Vision Research, 39 (1999) 1385-1406.

All correspondence should be addressed to:
Professor Stephen Grossberg
Department of Cognitive and Neural Systems
Boston University
677 Beacon Street
Boston, MA 02215
Phone: 617-353-7858
Fax: 617-353-7755
E-mail: steve@cns.bu.edu

Keywords: pattern recognition, boundary segmentation, surface representation, filling-in, texture classification, neural network, adaptive resonance theory

¹ Supported in part by the Defense Research Projects Agency and the Office of Naval Research (ONR N00014-95-1-0409) and the Office of Naval Research (ONR N00014-95-1-0657).
² Supported in part by the Defense Research Projects Agency and the Office of Naval Research (ONR N00014-95-1-0409).

Abstract

A self-organizing ARTEX model is developed to categorize and classify textured image regions. ARTEX specializes the FACADE model of how the visual cortex sees, and the ART model of how temporal and prefrontal cortices interact with the hippocampal system to learn visual recognition categories and their names. FACADE processing generates a vector of boundary and surface properties, notably texture and brightness properties, by utilizing multi-scale filtering, competition, and diffusive filling-in. Its context-sensitive local measures of textured scenes can be used to recognize scenic properties that gradually change across space, as well as abrupt texture boundaries. ART incrementally learns recognition categories that classify FACADE output vectors, class names of these categories, and their probabilities. Top-down expectations within ART encode learned prototypes that pay attention to expected visual features. When novel visual information creates a poor match with the best existing category prototype, a memory search selects a new category with which to classify the novel data. ARTEX is compared with psychophysical data, and is benchmarked on classification of natural textures and synthetic aperture radar images. It outperforms state-of-the-art systems that use rule-based, backpropagation, and K-nearest neighbor classifiers.
1 Introduction

1.1 Background and Benchmarks

The brain's unparalleled ability to perceive and recognize a rapidly changing world has inspired an increasing number of models aimed at exploiting these properties for purposes of automatic target recognition. On the perceptual side, the brain can cope with variable illumination levels and noisy scenic data that combine information about edges, textures, shading, and depth that are overlaid in all parts of a scene. This type of general-purpose processing enables the brain to deal with a wide range of imagery, both familiar and unfamiliar. On the recognition side, the brain can autonomously discover and learn recognition categories and predictive classifications that shape themselves to the statistics of a changing environment in real time. The present article develops a new self-organizing neural architecture that combines perceptual and recognition models that exhibit these desirable properties.

These models have individually been derived to explain and predict data about how the brain generates perceptual representations in the striate and prestriate visual cortices (e.g., Arrington, 1994; Baloch & Grossberg, 1997; Francis & Grossberg, 1996; Gove, Grossberg, & Mingolla, 1995; Grossberg, 1994, 1997; Grossberg, Mingolla, & Ross, 1997; Pessoa, Mingolla, & Neumann, 1995) and uses these representations to learn attentive recognition categories and predictions through interactions between inferotemporal, prefrontal, and hippocampal cortices (e.g., Bradski & Grossberg, 1995; Carpenter & Grossberg, 1993; Grossberg, 1995; Grossberg & Merrill, 1996). The perceptual theory in question is called FACADE theory. It consists of subsystems called the Boundary Contour System (BCS) and the Feature Contour System (FCS) that generate 3-D boundary and surface representations that model the cortical interblob and blob processing streams, respectively. The adaptive categorization and predictive theory is called Adaptive Resonance Theory, or ART. ART models are capable of stably self-organizing their recognition codes using either unsupervised or supervised incremental learning in any combination through time (Carpenter & Grossberg, 1991; Carpenter et al., 1992).

The present work develops the ARTEX model to classify scenes that include complex textures, both natural and artificial. The ARTEX architecture was built up from specialized versions of FACADE and ART models that have been designed to achieve high competence in classifying textured scenes without also incorporating mechanisms that are not essential for understanding this competence. Just as the properties of the FACADE and ART models are emergent properties that are due to interactions of their various parts, the properties of the ARTEX architecture are also emergent properties due to interactions within and between its FACADE and ART modules. These new emergent properties are not merely the sum of the parts of the modules from which they are derived, and need to be analysed on their own terms.

In order to understand the emergent properties that are achieved by joining a FACADE vision preprocessor to an ART adaptive classifier, ARTEX is benchmarked against state-of-the-art alternative models of texture classification. Our most striking results are derived through benchmark studies that classify natural textures from the Brodatz (1966) texture album, which is often used as a standardized test of texture classification models. ARTEX benchmarks emulated the conditions under which others benchmarked their algorithms on Brodatz textures. A single trial of on-line incremental category learning by ARTEX can outperform another leading model's off-line batch learning using a complex rule-based system (Greenspan, 1996; Greenspan et al., 1994). ARTEX also outperforms K-nearest neighbor models in both accuracy and data compression, and multilayer perceptrons (back propagation) in both accuracy and processing time.

The classification errors that ARTEX does produce are compared with human perception of texture similarities (Rao & Lohse, 1993, 1996). A correlation exists between the psychophysically measured similarity between two textures and the probability that ARTEX will confuse them.

ARTEX is also used to classify regions in real-world scenes that have been processed by synthetic aperture radar (SAR). SAR imagery has recently become popular in many satellite image processing applications because the SAR sensor can penetrate variable weather conditions (Novak et al., 1990; Waxman et al., 1995). The SAR images present a challenge for texture classifiers because they contain pixel intensities that vary over five orders of magnitude and are corrupted by high levels of multiplicative noise, yielding incomplete and discontinuous boundary and surface representations. Results below on natural texture and SAR images illustrate how pattern recognition models that are based on biological principles and mechanisms can outperform models that have been derived from more traditional engineering concepts.

1.2 Psychophysical Data and Model Properties

At least two different approaches exist to texture classification. In one approach, the focus is on separating regions with different textures by finding the boundaries between them (Bergen & Adelson, 1988; Fogel & Sagi, 1989; Gurnsey & Browse, 1989; Malik & Perona, 1990; Rubenstein & Sagi, 1990; Bergen & Landy, 1991). Another approach attempts to classify the textures within small regions of a scene (Caelli, 1985, 1988; Bovik, Clark, & Geisler, 1990; Jain & Farrokhnia, 1991; Greenspan et al., 1994). Such an approach discovers texture boundaries by classifying the textures within each region differently. It can also classify local regions whose textural properties vary gradually across space, and thus are not separated by a distinct boundary.

Gurnsey and Laundry (1992) have provided psychophysical data in support of the latter type of processing by showing that human texture recognition is only slightly impaired when the boundaries between different textures in a texture mosaic are blurred.

ARTEX does the latter type of classification. It derives a 17-dimensional feature vector from multiple-scale boundary features of the BCS and a surface brightness feature of the FCS. This feature vector utilizes filters of four different scales, as suggested by psychophysical experiments (Harvey & Gervais, 1978; Richards, 1979; Wilson & Bergen, 1979). The spatial filters are evaluated at four different orientations, thereby leading to a 16-dimensional (4 × 4) feature vector. The 17th dimension is a surface brightness feature. The ARTEX model uses these feature vectors to generate a context-sensitive classification of local texture properties. These BCS and FCS operations are designed to be as simple and fast as possible without incurring a loss of accuracy in classifying texture data.

A large psychophysical literature supports the FACADE hypothesis that the human brain forms distinct boundary and surface representations before they are bound together by object recognition categories. Experimental results that support the role of boundary representations include the following: (1) Object superiority effects occur using outline stimuli with little surface detail (Davidoff & Donnelly, 1990; Homa, Haver, & Schwartz, 1976). (2) The number of errors in tachistoscopic recognition and the speed of identification are often comparable using appropriately and inappropriately colored objects (Mial, Smith, Doherty, & Smith, 1979; Ostergaard & Davidoff, 1985). (3) There is no difference in recognition speed using black-and-white photographs or line drawings that are carefully derived from them (Biederman & Ju, 1988).

Several types of data also implicate a separate surface brightness and color process. These include the following: (4) Colored surfaces may be bound to an incorrect form during illusory conjunctions (McLean, Broadbent, & Broadbent, 1983; Stefurak & Boynton, 1986; Treisman & Schmidt, 1982). (5) Color can facilitate object naming if the objects to be named are structurally similar or degraded (Christ, 1975; Price & Humphreys, 1989). (6) Colors are coded categorically prior to the processing stage at which they are named (Davidoff, 1991; Rosch, 1975). Two of the most recent studies in support of the boundary-surface distinction were carried out by Elder and Zucker (1998) and Rogers-Ramachandran and Ramachandran (1998).

FACADE theory proposes that 3-D boundary and surface features that are formed in the prestriate visual cortex are categorized in the inferotemporal cortex (Grossberg, 1994, 1997). Both boundary and surface properties are proposed to be combined during the categorization process within bottom-up and top-down adaptive pathways that are modeled by an ART system. Two consequences of this conception are that unambiguous boundaries can generate category recognition by themselves, and that boundaries can prime 3-D object representations even if they need to be supplemented by 3-D surface information in order to achieve unambiguous recognition. Cavanagh (1997) has reported data consistent with this latter prediction.
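As a rough illustration of the 17-dimensional feature vector described in this section, the following sketch concatenates 16 orientational contrast values (4 scales × 4 orientations) with one brightness value for a single image location. The function name, the scale-major ordering, and the placement of brightness in the last slot are assumptions made for the example; the text does not specify how the components are ordered.

```python
N_SCALES = 4        # four spatial scales, as stated in the text
N_ORIENTATIONS = 4  # four orientations per scale, as stated in the text


def assemble_feature_vector(orientational_contrast, brightness):
    """Build one 17-dimensional boundary-surface feature vector.

    orientational_contrast: 4 rows (scales) of 4 values (orientations),
        standing in for the BCS outputs at a single image location.
    brightness: scalar standing in for the FCS surface brightness there.
    The ordering (scale-major, brightness last) is an assumption.
    """
    if len(orientational_contrast) != N_SCALES or any(
        len(row) != N_ORIENTATIONS for row in orientational_contrast
    ):
        raise ValueError("expected a 4 x 4 grid of contrast values")
    # Flatten the 4 x 4 grid into 16 boundary features.
    features = [float(v) for row in orientational_contrast for v in row]
    # Append the single surface brightness feature as the 17th dimension.
    features.append(float(brightness))
    return features


# Example with placeholder numbers (not data from the paper).
example = assemble_feature_vector([[0.1, 0.2, 0.3, 0.4]] * 4, 0.5)
```

One such vector would be produced per sampled location and passed to the classifier.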
In the ARTEX implementation of this concept, the feature vectors that are formed from the 17-dimensional boundary and surface features of the FACADE preprocessor are input to an ART classifier, which categorizes the textures using a biologically-motivated learning algorithm. Humans learn to discriminate textures by looking at them and becoming sensitive to their statistical properties in small regions. This is how our model is trained. Intuitively speaking, model training is like having an observer look at a number of locations and trying to learn to categorize them based on their local properties. The ART classifier we used, called Gaussian ARTMAP, or GAM, incrementally constructs internal categories that have Gaussian receptive fields in the input space, and that map to output class predictions (Williamson, 1996, 1997). Cells with Gaussian receptive fields are ubiquitous in the brain, and have been used to model data about how the inferotemporal cortex learns to categorize visual input patterns (Logothetis et al., 1994). Such models are not, however, typically able to self-organize their own recognition categories and to autonomously search for new ones with which to classify novel input patterns. ART models overcome this weakness by showing how complementary attentional and orienting systems are designed with which to balance between the processing of familiar and expected events, on the one hand, and unfamiliar and unexpected events on the other (Carpenter & Grossberg, 1991; Grossberg, 1980; Grossberg & Merrill, 1996). All learned categorization goes on within the attentional system. The orienting subsystem is activated in response to events that are too novel for the attentional system to successfully categorize them. Interactions between the attentional and orienting subsystems then lead to a memory search which discovers a more appropriate population of cells with which to categorize the novel information. These interactions are designed to explain how the brain continues to learn quickly about huge amounts of new information throughout life, without being forced to just as quickly forget useful information that it has previously learned.

After each input is presented (i.e., each location is observed), GAM automatically activates cells whose receptive fields adapt to represent the input by amounts proportional to their level of match with the input. However, if the input is too novel for any existing receptive field to match the input well enough, then a memory search is triggered which leads to the selection of a previously uncommitted cell population with which a new category can be learned. During unsupervised learning, the correct names of the regions that are being classified are not supplied, and the level of match that is required for a category to learn is constant. The parameter that determines this degree of match is called the vigilance parameter because it computationally realizes the intuitive process of being more or less vigilant in response to information of variable importance (Carpenter & Grossberg, 1991). Low vigilance allows the network to learn general categories in which many input exemplars may share the same category prototype. High vigilance enables the network to learn more specific categories, even categories in which only a single exemplar may be represented. Thus the choice of vigilance can trade between prototype and exemplar learning, even within a single ART system. Experimental evidence consistent with vigilance control has been reported in monkeys when they attempt to perform classifications during easy vs. difficult discriminations (Spitzer, Desimone, & Moran, 1988).

Learning typically starts with a low vigilance value, which leads to the formation of the most general categories that are consistent with the input data. Because ART models are self-organizing, such learning can proceed on its own in an unsupervised mode. Starting with a low vigilance value conserves memory resources, but it can also create the tendency, also found in children, to overgeneralize until further learning leads to category refinement (Chapman, et al., 1986; Clark, 1973; Smith et al., 1985; Smith & Kemler, 1978; Ward, 1983). For example, it might happen that, after learning a category that classifies variations on the letter E, the letter F will also activate that category, based on the visual similarity between the two types of letters. The difference between the letters E and F is determined by cultural factors, not by visual similarity. Supervised learning is often essential to prevent errors based on input similarity which do not correspond to cultural understandings, or other environmentally dependent factors. ART models can operate in both unsupervised and supervised learning modes, and can switch between the two seamlessly during the course of learning.

During supervised learning, the vigilance parameter, or required match level, is raised if an incorrect prediction is made (e.g., if there is negative reinforcement) by just enough to trigger a memory search for a new category. This type of vigilance control sacrifices category generality only when more specific categories are needed to match the statistical properties of a given environment. Categories of variable generality are hereby automatically learned based upon the success or failure of previously learned categories in predicting the correct classification. A block diagram of the ARTEX architecture is shown in Figure 1.

Figure 1: Block diagram of ARTEX image classification subsystems. (Image not reproduced. Diagram labels: Input Image; Discount Illuminant; Multiple Scale BCS; Single Scale FCS; Orientational Contrast Features; Surface Brightness Feature; Gaussian ARTMAP Classifier; Output Prediction of Region Type.)

2 Multiple-scale Oriented Filter

The ARTEX multiple-scale oriented filter further develops the BCS filter that was introduced to explain texture data in Grossberg and Mingolla (1985). Variants of this BCS filter have since become standard in many texture segmentation algorithms (Malik & Perona, 1989; Sutter, Beck, & Graham, 1989; Bovik et al., 1990; Bergen, 1991; Bergen & Landy, 1991; Jain & Farrokhnia, 1991; Graham, Beck, & Sutter, 1992; Greenspan et al., 1994). Figure 2 diagrams the ARTEX version of BCS processing (Stages 1-5) for a single spatial scale. As in Richards (1979), we used 4 spatial frequency channels. Each channel computed 4 orientational contrast features. These filter equations and parameters are described in Appendix I. A functional description is given here.

Stage 1 of the BCS filter uses an on-center off-surround network whose cells obey membrane equations, or shunting laws (Grossberg, 1980, 1983) to discount the illuminant, compute contrast ratios of the image, and normalize image intensities. Stage 2 accomplishes multiple-scale oriented filtering using odd-symmetric Gabor filters at the 4 orientations and spatial scales. Stage 3 computes a local measure of absolute orientational contrast by full-wave rectifying the filter activities from Stage 2. These operations are neurally interpreted as follows: Stage 1 operations occur in the retina and LGN, Stage 2 operations at cortical simple cells, and Stage 3 operations at cortical complex cells (Grossberg & Mingolla, 1985). Stage 4 simplifies the BCS operations of boundary grouping by computing a smooth, reliable measure of orientational contrast that spatially pools responses within the same orientation. Stage 5 performs an optional orientational invariance operation which shifts orientational responses at each scale into a canonical ordering. This computation shifts, with wraparound, the smoothed orientational responses from Stage 4 so that the orientation with maximal amplitude is in the first orientation plane. The usefulness of this operation is task-dependent, as shown by our simulations below.

Graham et al. (1992) also simplified Stage 4 of the BCS by pooling responses from Stage 3. They then used a hand-crafted sigmoidal discrimination measure to convert Stage 4 output into a probabilistic output function that could be compared with subjects' ratings of texture discriminability. In the present benchmark studies, the BCS filter outputs form part of the input vectors to a GAM classifier which autonomously learns the probabilistic recognition categories with which texture discriminations are made. We note in Section 3 how the Graham et al. (1992) study has been extended to explain a larger data base about texture discrimination using additional FACADE theory mechanisms.

3 Filled-in Surface Brightness

The FACADE model suggests how the BCS and FCS interact to generate filled-in 3-D surface representations within the FCS. These surface representations are derived from scenic data after the illuminant has been discounted, as in Stage 1 of Figure 2. In general, these surface representations combine information about brightness, color, depth, and form. Our simulations below demonstrate the utility of using a filled-in surface brightness feature to help learn recognition categories for texture discrimination.

The simplest surface feature is one that is based on first-order differences in illumination intensity. An improved surface feature discounts the illuminant to compute a measure of local contrast. Such a feature, however, can still be corrupted by various sorts of specular noise in an image. In the brain, such noise can be due to the blind spot, retinal veins, and the retinal layers through which light must pass to activate photodetectors. In artificial sensors, too, such noise can derive from sensor characteristics. Discounting the illuminant is also insensitive to contextual groupings of image features. A filled-in surface brightness feature overcomes these deficiencies by smoothing local contrast values when they belong to the same region, while maintaining contrast differences when they belong to different regions. Filling-in hereby smooths over image noise in a form-sensitive way, and generates a representation that reflects properties of a region's form by being contained within the region boundaries. It also tends to maximize the separability, in brightness space, of different region types by minimizing within-region variance while maximizing between-region variance. This sort of preattentive and automatic separation simplifies the task of an attentive pattern classifier such as GAM.

In Grossberg et al. (1995), a multiple-scale FACADE network was developed to process noisy SAR images for use by human operators. There the goal was to generate reconstructions of SAR images that were pleasing to the eyes of expert photointerpreters. The BCS in this simulation used a grouping network with a feedback process that can complete and sharpen boundary representations. These boundary groupings created sharply delineated image regions and filled-in surfaces. Although such a feedback grouping network has the remarkable property of converging within 1 to 3 feedback iterations, it still has the disadvantage, at least in software simulations, of slowing down processing time. Here we replace the full BCS filter and grouping network by a multiple-scale BCS filter and a single scale of one-pass feedforward boundary processing to control filling-in of the brightness feature. Computer simulations summarized below demonstrate that this simplification does not impair classification benchmarks on Brodatz textures and on SAR textured scenes. The simplified boundary segmentation is, moreover, computationally 75 times faster than the feedback network. The slower feedback benchmarks are not reported here. Accurate texture classification thus does not seem to depend upon photorealism of the corresponding percept. Stages 6-9 of Figure 2 show how the BCS filter output is used to derive the one-pass boundary segmentation. Appendix II contains the equations and parameters of this simplified brightness filling-in process.

Figure 2: Boundary and surface preprocessing stages. OV = orientationally variant representation. OI = orientationally invariant representation. Either OV or OI, but not both, are used for a given problem. (Image not reproduced. Stage labels: 1 Center-surround Processing; 2 Orientational Filtering; 3 Full-Wave Rectification; 4 Spatial Pooling; 5 Orientational Invariance; 6 Center-surround Processing; 7 Half-Wave Rectification; 8 Sum Across Orientations; 9 Boundary-Gated Diffusion. Panel labels: Input Image; Texture Processing; Boundary Processing; Surface Processing; Gaussian ARTMAP.)

These FACADE preprocessing results can be placed into a larger framework to better understand their relevance for understanding human texture discrimination. Three issues need to be considered: (1) the use of a simplified Stage 4 spatial pooling operation instead of long-range grouping by a feedback network; (2) the role of surface representations; and (3) the need for 3-D boundary and surface representations. When are long-range groupings, such as illusory contours, not needed to improve texture discriminability? This is more true when the images contain dense enough textures to obviate the need for grouping over long distances. Not all of the data considered even by Graham et al. (1992) were of this type, however, since their displays contained regularly placed features that could group together in orientations collinear, perpendicular, or oblique to their defining edges. Cruthirds et al. (1993) showed that a multiple-scale BCS filter, supplemented by the long-range groupings of a feedback network, could simulate the pairwise ordering of human ratings of texture discriminability better than the Graham et al. (1992) variant of the BCS filter on its own.

Grossberg and Pessoa (1997) have simulated a variant of FACADE theory in which both 2-D and 3-D boundary and surface operations were needed to simulate psychophysical data about the discrimination of textured regions composed of regular arrays of equiluminant colored regions on backgrounds of variable luminance, as in the experiments of Beck (1994) and Pessoa, Beck, & Mingolla (1996). This latter simulation study was restricted, however, to textures composed of colored squares on achromatic backgrounds, rather than the stochastic factors that arise in Brodatz and SAR textures. The Grossberg and Pessoa (1997) study also does not analyze how recognition categories for discriminating textures are learned. Taken together, however, these several studies provide converging evidence that FACADE mechanisms can explain challenging properties of data concerning human texture segregation.
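Two of the operations described in Sections 2 and 3 can be illustrated with a short sketch: the Stage 5 shift of orientational responses into a canonical ordering, and boundary-gated filling-in of a brightness signal. The code below is a minimal one-dimensional illustration under stated assumptions, not the Appendix I or Appendix II equations, which are not reproduced here. The function names, the permeability form `1 / (1 + gate * boundary)`, and the parameters `steps`, `rate`, and `gate` are all hypothetical choices made for the example.

```python
def canonical_orientation_order(responses):
    """Stage 5 sketch: shift with wraparound so the largest response is first."""
    k = max(range(len(responses)), key=lambda i: responses[i])
    return list(responses[k:]) + list(responses[:k])


def boundary_gated_diffusion(contrast, boundary, steps=100, rate=0.25, gate=1000.0):
    """Smooth a 1-D contrast signal within regions but not across boundaries.

    contrast: local contrast values, one per pixel.
    boundary: boundary strengths; boundary[i] lies between pixel i and i + 1.
    steps, rate, gate: illustrative parameters (assumptions, not from the paper).
    """
    if len(boundary) != len(contrast) - 1:
        raise ValueError("need one boundary value per adjacent pixel pair")
    values = [float(c) for c in contrast]
    # Permeability of each link: 1 where no boundary, near 0 across a strong one.
    permeability = [1.0 / (1.0 + gate * b) for b in boundary]
    for _ in range(steps):
        updated = values[:]
        for i, p in enumerate(permeability):
            # Activity flows from the larger neighbor to the smaller one.
            flow = rate * p * (values[i + 1] - values[i])
            updated[i] += flow
            updated[i + 1] -= flow
        values = updated
    return values


shifted = canonical_orientation_order([0.2, 0.9, 0.4, 0.1])  # -> [0.9, 0.4, 0.1, 0.2]

# Two noisy regions separated by one boundary between pixels 4 and 5.
noisy = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.3, 2.7, 3.1, 2.9]
edges = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
filled = boundary_gated_diffusion(noisy, edges)
```

In this toy example the values inside each region converge toward that region's mean while the difference between the two regions is largely preserved, which is the qualitative behavior the text attributes to filling-in: lower within-region variance with between-region contrast maintained.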
  84. 84. 4 Heuristics

The 16-dimensional feature vector produced by Stages 1-5 (representing orientational contrast at 4 orientations and 4 spatial scales) and the single filled-in brightness feature produced by Stages 6-9 yield a 17-dimensional boundary-surface feature vector. GAM must learn a mapping from the input space populated by these feature vectors to a discrete output space of associated region class labels. As noted above, GAM shares a number of key properties with other ARTMAP architectures (Carpenter, Grossberg, and Reynolds, 1991; Carpenter et al., 1992). GAM learns mappings incrementally, without any prior knowledge of the problem domain, by self-organizing an efficient set of recognition categories that shape themselves to the statistics of the input environment, as well as a map from recognition categories to class labels, which are supplied during supervised learning. Because GAM learns its mappings incrementally, a previously trained GAM network may be retrained with new input/output contingencies, including new class labels, without any need to retrain the network on the previous data. Finally, although GAM is trained only with individual class labels, it also learns to accurately estimate the probabilities of its class label predictions, as we show in our simulations below.

In a typical ART network (Carpenter and Grossberg, 1987, 1991), an input vector activates feature selective cells within the attentional system that store the vector in short-term memory. This short-term memory pattern then activates bottom-up pathways whose signals are filtered by learned adaptive weights, or long-term memory traces. The filtered signals are added up at target category nodes which compete via recurrent lateral inhibition to determine which category activities will be stored in short-term memory and thereby represent the input vector. The degree of activation of a category provides an estimate of the likelihood that an input belongs to the category. Activating a category is like making a hypothesis.

As they are being activated, the selected categories read out learned top-down expectations, or prototypes, which are matched against the input vector at the feature detectors. This matching process plays the role of testing the hypothesis. The vigilance parameter defines the criterion for a good enough match. As noted above, low vigilance leads to the learning of general categories, whereas high vigilance leads to the learning of specialized categories, even a single exemplar, in the limit of very high vigilance. By varying vigilance, an ART system can hereby learn both abstract prototypes and concrete exemplars.

If the chosen category's match function exceeds the vigilance parameter, then the bottom-up and top-down exchange of feedback signals locks the system into a resonant state. The resonant state signifies that the hypothesis matches the data well enough to be accepted by the system. ART proposes that these resonant states focus attention upon relevant feature combinations, and that only resonant states enter conscious awareness (Grossberg, 1980). Resonance triggers learning in both the bottom-up adaptive weights that are used to activate the selected recognition category, and in the top-down weights that represent its prototype. This learning incorporates the new information supplied by
  90. 90. the input vector into the long-term memory of the attentional system.

If the category's match function does not exceed vigilance, this designates that the hypothesis is too novel to be incorporated into the prototype of the active category. A bout of memory search, or hypothesis testing, is then triggered through activation of the orienting system. Memory search either discovers a category that can better represent the data or, if no such learned category already exists, automatically chooses uncommitted cells with which to learn a new category. ART hereby incrementally discovers new categories whose degree of generalization varies inversely with the size of the vigilance parameter. Neurobiological data about recognition learning in inferotemporal cortex that are consistent with these hypotheses are reviewed by Carpenter and Grossberg (1993) and Grossberg and Merrill (1996).

All of the above properties proceed autonomously in ART networks as they undergo unsupervised learning. ARTMAP extends these ART designs to include both supervised and unsupervised learning (Carpenter, Grossberg, and Reynolds, 1991; Carpenter et al., 1992). In ARTMAP, the chosen ART categories learn to make predictions which take the form of mappings to the names of output classes. In such an ARTMAP system, many different recognition categories can all learn to map into the same output name, much as many different visual fonts of a given letter of the alphabet can be grouped into several different visual recognition categories, based upon visual similarity, before these visual categories are mapped into the same auditory category that is used to name that letter.

ARTMAP systems propose how to correct a prediction, as in the case where the letter E is disconfirmed by environmental feedback that the correct letter is F, using only local operations in environments that may be filled with unexpected events. ARTMAP does this using a minimax learning principle, which conjointly maximizes predictive generalization while it minimizes predictive error. ARTMAP does this by trying to form the largest categories that are consistent with environmental feedback. A match tracking process realizes this principle by increasing the vigilance value after each disconfirmation until it exceeds the chosen category's match function. This vigilance increase is the minimal one that can trigger new hypothesis testing on that learning trial. Match tracking hereby gives up the minimum amount of generalization that is required to correct the error. In summary, an ARTMAP system organizes its categorization of experience based both on the similarity of the input feature vectors and upon feedback from the environmental response, whether culturally or otherwise determined, to the names or other behaviors that its categories predict.

5 Gaussian ARTMAP

Gaussian ART (Williamson, 1996, 1997) provides a means for an ART system to learn the statistics of an input environment. Each of its categories defines a Gaussian distribution in the input space, with a mean and variance in each input dimension, as well as an
  95. 95. overall a priori probability. The Gaussian ART bottom-up activation function evaluates the probability that the input belongs to a category, given its Gaussian distribution and a priori probability. The match function evaluates how well the input fits the category's distribution, which is normalized to a unit height. This match is a measure of the distance, in units of standard deviation, between the input vector and the category's mean. Vigilance specifies the maximum allowable size of this distance.

Gaussian ART also uses distributed learning, in which multiple categories can all cooperate to classify an input event. Gaussian ART hereby avoids the problems incurred by grandmother cell models of recognition. Each such category is assigned credit based on its proportion of the net activation, which is determined by all categories whose match functions satisfy the vigilance criterion. Each category then learns by an amount that is determined by its credit. When Gaussian ART is extended to Gaussian ARTMAP to enable it to benefit from both supervised and unsupervised learning, each category's credit is determined by its proportion of the net activation of its ensemble, which consists of all categories that map to the same output prediction. The normalized strength of each ensemble's prediction is a probability estimate for that prediction. The equations and parameters for Gaussian ARTMAP are found in Appendix III.

6 Some Alternative Texture Classifiers

6.1 Comparison of Feature Extraction Methods

In order to evaluate the promise of any vision system, particularly one that attempts to explain such a complex competence as textured scene classification, one needs to verify that it really works. This is particularly the case when the key behavioral properties emerge due to interactions across the entire system. There is thus no substitute for running such a system on benchmarks on which competing systems have also been evaluated. Our benchmark comparisons, presented in Section 7, evaluate ARTEX under conditions that are as similar as possible to those under which these competing systems have been evaluated. ARTEX performance is first compared to that of a system that was used to classify natural textures in Greenspan et al. (1994) and Greenspan (1996). We call their model the Hybrid System because it is a hybrid architecture that used a log-Gabor Gaussian pyramid for feature extraction followed by one of three alternative classifiers. Although the Hybrid System was not developed to explain biological data, it has the virtue of having been developed to the point that it could be successfully tested on benchmark databases that use textures or textured scenes as their inputs. Most other biologically derived models have not yet reached this level of development.

The Hybrid System's log-Gabor pyramid uses three levels, or spatial scales, and four orientations at each scale. Each level, after the first one, of the Gaussian pyramid is
  103. 103. obtained by blurring the previous lower level (i.e., smaller spatial scale) with a Gaussian kernel (with standard deviation σ = 1) and then decimating the image (i.e., removing 3 out of 4 pixels in each 2 × 2 pixel block). Due to decimation, the Gaussian at each successive level effectively has twice the σ of the Gaussian used in the previous level. The final outputs of all three pyramid levels of the Hybrid System have the same net amount of blurring, produced by three successive blur/decimate steps. This amount of blurring is equivalent to convolving with a single Gaussian kernel with σ = √21 = √(1² + 2² + 4²), which produces an 8 × 8 pixel resolution. That is, each patch of 8 × 8 pixels in the input image yields a single pixel in an output image for each oriented contrast feature. In Greenspan (1996), classification results at 16 × 16, 32 × 32, and 64 × 64 resolution were also reported.

Without further preprocessing, ARTEX produces feature images at single pixel resolution. To make a fair comparison with the results reported by Greenspan et al. (1994) and Greenspan (1996), ARTEX feature images need to be reduced, via blurring and decimation, to the same resolution used there. For example, to change the ARTEX features to 8 × 8 resolution, the smaller-scale ARTEX features require additional blurring prior to decimation so that their net amount of blurring is equivalent to convolving with a single Gaussian kernel with σ = √21.

The net amount of blurring is a crucial consideration for the two types of tasks on which the systems are compared. The first task is classification of a library of texture images. Because this task does not include transitions between different textures, performance monotonically improves as blurring is increased, since blurring reduces variance and thus improves the signal-to-noise ratio. The second task is classification of a texture mosaic. Here, texture transitions need to be accurately resolved, so performance degrades with over-blurring. We demonstrate both of these phenomena below.

6.2 Comparison of Classification Methods

In the Hybrid System's first classification scheme, the extracted features are clustered independently in each feature dimension using the K-means procedure. Mappings from these clusters to class labels are then formed using a batch learning, rule-based algorithm called ITRULE (Goodman et al., 1992). The clusters in this scheme are formed to discretize the input, so that ITRULE can form explicit rules mapping them to the output classes. ITRULE forms a large number of rules. The exact number is never stated in Greenspan (1996). On the large problems, however, a maximum of 10,000 is allowed, and as many as 430 rules per class are reported for discriminating only two textures. Another drawback of this approach is that unsupervised discretization via K-means clustering throws away potentially important information because the clusters may span discrimination boundaries in the input space. Finally, GAM enjoys a major practical advantage in that it uses a simple incremental learning procedure as opposed to the complex and computationally expensive batch learning procedure used by ITRULE.
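The net-blur figure quoted in Section 6.1 (a single equivalent Gaussian with σ = √21 after three blur/decimate steps) can be checked numerically, because variances add when Gaussians are convolved. The sketch below is only an illustration under stated assumptions: the per-level standard deviations are taken as 1, 2, and 4 in input-pixel units, decimation itself is not simulated, and the helper names (gaussian_kernel, kernel_std, net_sigma) are hypothetical rather than taken from either system.

```python
import math

import numpy as np


def gaussian_kernel(sigma, radius):
    """Sampled, normalized 1-D Gaussian on the integers -radius..radius."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def kernel_std(kernel):
    """Standard deviation of a normalized kernel treated as a distribution."""
    x = np.arange(kernel.size, dtype=float)
    mean = (x * kernel).sum()
    return math.sqrt((((x - mean) ** 2) * kernel).sum())


def net_sigma(sigmas):
    """Equivalent single-Gaussian sigma: variances add under convolution."""
    return math.sqrt(sum(s * s for s in sigmas))


# Assumed effective sigmas of the three pyramid levels, in input pixels.
sigmas = [1.0, 2.0, 4.0]

# Convolve the three kernels one after another and measure the result.
composite = np.array([1.0])
for s in sigmas:
    composite = np.convolve(composite, gaussian_kernel(s, radius=int(8 * s)))

print("closed form:", net_sigma(sigmas))      # sqrt(1 + 4 + 16)
print("measured   :", kernel_std(composite))  # std of the cascaded kernel
```

Both printed values should agree closely with √21, which is why smaller-scale ARTEX features need extra blurring before decimation to be comparable at 8 × 8 resolution.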
  112. 112. The two alternative classifiers used in Greenspan (1996) are standard incremental learning schemes: the K-nearest neighbor (K-NN) classifier and the multilayer perceptron (MLP), backpropagation algorithm. These two approaches have complementary advantages and flaws. K-NN learns quickly (one training epoch) but achieves no data compression. MLP, on the other hand, achieves better data compression but learns very slowly (500 slow-learning training epochs in Greenspan, 1996). An additional drawback of MLP is that it uses a form of mismatch learning that may suffer from catastrophic forgetting if trained on new data with different contingencies from previous data. As is demonstrated by our results below, GAM combines the good properties of the above three classifiers: like ITRULE, GAM predicts the posterior probabilities of the output classes; like K-NN, GAM learns local mappings quickly; like MLP, GAM achieves significant data compression. Although GAM uses a more local representation than MLP, and thus could, in principle, require more memory, GAM compensates for this by constructively forming a representation of appropriate size for whatever problem it is trained on.

7 Texture Classification Results

7.1 10-Texture Library

ARTEX was first compared to the Hybrid System on the library of ten textures shown in Figure 3A, whose top row contains structured textures and whose bottom row contains unstructured textures. Each texture image consists of 128 × 128 pixels. Three other images of each texture are not shown. In Greenspan (1996), classification results of the Hybrid System using ITRULE, K-NN, and MLP classifiers were published for this database. The classifiers were trained on data at three different levels of spatial resolution, with a different number of training samples per class at each resolution: 300 samples at 8 × 8 resolution, 125 samples at 16 × 16 resolution, and 40 samples at 32 × 32 resolution. ARTEX was trained on the same data set under the same conditions. Like the Hybrid System, ARTEX used an orientationally variant, or OV, representation on this problem, since generalization to novel orientations of the same texture during testing was not required. ARTEX was evaluated with five random orderings of the data, and the results were averaged.

Table 1 shows comparative results for the Hybrid System and ARTEX at the three spatial resolutions. Table 1 lists the classification rate, number of epochs, and number of categories (or hidden units, stored exemplars, etc.) for each system configuration. The number of epochs indicates how many training trials were needed. The number of categories indicates how well the model compresses the data. In the case of K-NN, there is no compression, so each input or exemplar forms a different category. The number of weights indicates the memory resources, or computational complexity, that is needed to achieve this degree of compression. The goal is to minimize the number of epochs, categories, and weights. 60 hidden units are listed for MLP because the average MLP
  124. 124. Figure 3: (Next page.) A) 10-texture database of textures corresponding to Figure 2 [...] et al. (1994). Top row consists of structured textures, and bottom row of unstructured textures. Textures from Brodatz album are labeled with plate number. Top row (left to right): herringbone weave (D17), french canvas (D21), cotton canvas (D77), jeans [...]. Bottom [...] right): grass (D9), pressed cork (D4), handmade paper (D57), pigskin (D92), and wo[...]. B) 42-texture database from Brodatz album. ROW 1: reptile skin (D3), cork (D4), wire (D9), bark (D12), straw (D15). ROW 2: herringbone (D17), wool (D19), french canvas (D24), sand (D29), water (D38). ROW 3: straw matting (D55), handmade paper (D57 [...] (D68), cotton canvas (D77), raffia looped (D84), pigskin (D92). ROW 4: fur (D93), skin (D10), homespun wool (D11), raffia weave (D18), ceramic brick (D26), netting (D[...]. [...] 5: lizard skin (D36), straw screening (D49), raffia woven (D50), oriental cloth ([...] cloth (D53), oriental rattan (D65). ROW 6: plastic pellets (D66), oriental gra[...] oriental cloth (D78), oriental cloth (D80), oriental cloth (D82), woven matting [...] straw matting (D85), sea fan (D87), brick (D95), burlap (D103), cheesecloth (D105 [...] (D110).
  126. 126. 10-Texture Problem

Configuration                                   Class. Rate   Samples/Class   Epochs   Categories   Weights

8 × 8 Resolution:
Hybrid System, ITRULE                           94.3          300             Batch    -            -
Hybrid System, MLP                              94.5          300             500      60           1,500
Hybrid System, K-NN                             87.0          300             1        3,000        48,000
ARTEX, all features                             95.8          300             1        26.6         958
ARTEX, all features                             96.3          300             5        34.0         1,224
ARTEX, no large-scale features                  97.1          300             5        41.0         1,148
ARTEX, no brightness feature                    95.6          300             5        38.4         1,306
ARTEX, no large-scale or brightness features    95.7          300             5        47.2         1,227

16 × 16 Resolution:
Hybrid System, ITRULE                           95.0          125             Batch    -            -
Hybrid System, MLP                              96.0          125             500      60           1,500
Hybrid System, K-NN                             93.0          125             1        1,250        20,000
ARTEX, all features                             97.2          125             1        17.4         626

32 × 32 Resolution:
Hybrid System, ITRULE                           97.8          40              Batch    -            -
Hybrid System, MLP                              100.0         40              500      60           1,500
Hybrid System, K-NN                             99.0          40              1        400          6,400
ARTEX, all features                             100.0         40              1        10.6         382

Table 1: Recognition statistics on 10-texture library at three pixel resolutions: 8 × 8, 16 × 16, and 32 × 32. The number of weights is determined by multiplying the number of categories by the number of weights per category, which is calculated based on the dimensionality of the input space and the number of output classes. The input dimensionality is 15 for the Hybrid System, [...] ARTEX, and the number of output classes is 10 because there are 10 textures. For MLP, the number of weights per category is 15 + 10 = 25. For K-NN, it is 15 + 1 = 16. For ARTEX with all features, it is 2 × 17 + 2 = 36. For ARTEX with no large-scale features (13 input dimensions), it is 28. For ARTEX with no brightness feature, it is 34. For ARTEX with no large-scale or brightness features (12 input dimensions), [...]. For example, the 48,000 weights for K-NN are computed as follows. The Hybrid System [...] features per input sample. With K-NN, these 15 features plus the correct class [...] stored for each training sample. Therefore, the number of weights that must be s[...] (number of training samples). Since there are 300 samples/class and 10 classes, [...] training samples. In all, 16 × 3,000 = 48,000 weights.
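The weight counts in Table 1 can be reproduced with a few lines of arithmetic. The sketch below is a reading of the table caption, not code from the paper: the per-category formulas (input dimensions plus classes for MLP, input dimensions plus one label for K-NN, twice the input dimensions plus two for ARTEX) are assumptions inferred from the caption and the tabulated values, and the function names are hypothetical.

```python
def weights_per_category(model, input_dim, n_classes=10):
    """Weights stored per category (assumed forms inferred from Table 1)."""
    if model == "mlp":
        # One hidden unit: weights from every input and to every output class.
        return input_dim + n_classes
    if model == "knn":
        # One stored exemplar: its features plus its class label.
        return input_dim + 1
    if model == "artex":
        # Assumed: two values per input dimension plus two more per category.
        return 2 * input_dim + 2
    raise ValueError(f"unknown model: {model}")


def total_weights(model, n_categories, input_dim, n_classes=10):
    """Categories times weights per category, rounded as in the table."""
    per_category = weights_per_category(model, input_dim, n_classes)
    return round(n_categories * per_category)


if __name__ == "__main__":
    # 8 x 8 resolution rows of Table 1.
    print(total_weights("knn", 3000, 15))    # K-NN, 300 samples x 10 classes
    print(total_weights("mlp", 60, 15))      # MLP, 60 hidden units
    print(total_weights("artex", 26.6, 17))  # ARTEX, all features, 1 epoch
    print(total_weights("artex", 41.0, 13))  # ARTEX, no large-scale features
```

Because the ARTEX category counts are averages over five random data orderings, they are fractional, which is why the products are rounded to the nearest whole weight.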
  128. 128. results were reported for 30, 60, and 90 hidden units.

ARTEX was tested with several configurations, with different subsets of its features removed. With its full 17-dimensional feature set, ARTEX achieved 95.8% correct after only one incremental training epoch, and 96.3% after five epochs. By comparison, the Hybrid System with K-NN achieved only 87.0% correct after one training epoch, at the cost of 3,000 stored exemplars compared to 23 internal categories for ARTEX. With much longer training times (i.e., 500 training epochs using MLP, or the computationally expensive batch-learning procedures using K-means and ITRULE), the Hybrid System did not match the performance of ARTEX with only one incremental learning epoch, and exhibited 49% more errors than ARTEX with 5 training epochs.

Three alternative ARTEX configurations were also tested to elucidate why ARTEX achieved better results than the Hybrid System. ARTEX uses four spatial scales versus only three for the Hybrid System. Therefore, perhaps its largest spatial scale conferred an advantage to ARTEX. This possibility was tested by removing the largest scale, resulting in a slight performance increment (97.1%). Another unique feature used by ARTEX is its filled-in surface brightness feature, which seems to be more effective than the multi-scale Gaussian blurring used by the Hybrid System. Removing the brightness feature resulted in a performance decrement (95.6%). This difference quantifies how much surface as opposed to boundary properties influence recognition accuracy on these data. Finally, both the large-scale and the brightness features were removed. This resulted in a similar performance decrement (95.7%).

The modest role played by the surface brightness feature in classifying these data is consistent with cognitive evidence summarized above suggesting that boundary inputs that go directly to the human cognitive recognition system are often sufficient to accurately recognize many objects. Surface brightness and color properties become more important insofar as the boundary information, by itself, is ambiguous. Given that boundaries are predicted to be perceptually invisible within the BCS itself (viz., the interblob cortical processing stream), these results are consistent with the possibility of being able to quickly begin to recognize certain objects using their invisible boundaries even before these objects become visible through their surface properties.

The ARTEX advantage, even with five ARTEX features removed, is probably due to some remaining differences between the systems: (1) the nature of band-pass filtering prior to orientational filtering, (2) the bandwidth characteristics of the orientational filters, (3) spatial pooling at the third spatial scale, and/or (4) the classification scheme. The first difference is in the Stage 1 band-pass filtering operation prior to the orientational Gabor filtering. The Hybrid System uses a Laplacian pyramid in which both the center and surround Gaussians that make up the band-pass filter double in size with each scale. In ARTEX, on the other hand, only the surround Gaussian grows with each successive spatial scale. It preserves on-center resolution while varying the scale of image normalization and noise suppression. Thus, the Hybrid System is much more restrictive in the range of spatial frequencies that are passed through to its orientational filtering stage. The second
  144. 144. difference is that the oriented filters used by the two models have different bandwidth characteristics: the ARTEX Gabor filters are defined with higher-frequency sinewaves (50% higher frequency; see Appendix I for parameters). The third difference is that Stage 4 of ARTEX performs spatial pooling following orientational filtering at each spatial scale. The Hybrid System does not do this in its largest spatial frequency channel at 8 × 8 resolution. Therefore, this discrepancy might help explain why ARTEX outperforms the Hybrid System at 8 × 8 resolution, but not at lower resolutions. The fourth difference is the classification stage. The advantages of the self-organizing Gaussian ARTMAP classifier over those used by the Hybrid System are described above.

7.2 Larger Texture Libraries

In Greenspan (1996), recognition statistics of the Hybrid System on a 30-texture library were presented. This library consists of 19 textures from the Brodatz album, and 11 additional textures of comparable complexity. We were unable to obtain this database, and so we chose to evaluate ARTEX on a library of similar textures obtained solely from the Brodatz album, which contains the 19 textures used in Greenspan (1996) as a subset. Figure 3B shows this library of 42 Brodatz textures. The plate numbers from the Brodatz album are listed in the caption. The 19 textures evaluated in Greenspan (1996) comprise the first three rows of Figure 3, as well as the
