ãæ·±å±€åŠç¿ãšèªç¶èšèªåŠçããªãã¯ã¹ãã©ãŒã倧/DeepMind è¬çŸ©ãŸãšãïŒçšèªéä»ãïŒ
Posted on 2018-02-08(æš) in Deep Learning
ãªãã¯ã¹ãã©ãŒã倧ã®ã深局åŠç¿ãšèªç¶èšèªåŠçã(Oxford Deep NLP 2017 course)ã®è¬çŸ©ã¡ã¢ã§ãã
è¬çŸ©ãããªãã¹ã©ã€ããè¬çŸ©ã®è©³çްçã«ã€ããŠã¯ãè¬çŸ©ã®å ¬åŒããŒãžãåç §ããŠãã ããã
ç®æ¬¡
- è¬çŸ©1a: å°å ¥ (Phil Blunsom)
- è¬çŸ©1b: 深局ãã¥ãŒã©ã«ãããã¯åã ã¡ (Wang Ling)
- è¬çŸ©2a: åèªã¬ãã«ã®æå³ (Ed Grefenstette)
- è¬çŸ©2b: å®ç¿ã®æŠèŠ (Chris Dyer)
- è¬çŸ©3: èšèªã¢ãã«ãšRNN ããŒã1 (Phil Blunsom)
- è¬çŸ©4: èšèªã¢ãã«ãšRNN ããŒã2 (Phil Blunsom)
- è¬çŸ©5: ããã¹ãåé¡ (Karl Moritz Hermann)
- è¬çŸ©6: Nvidia ã® GPU ã䜿ã£ã深局åŠç¿ (Jeremy Appleyard)
- è¬çŸ©7: æ¡ä»¶ä»ãèšèªã¢ããªã³ã° (Chris Dyer)
- è¬çŸ©8: ã¢ãã³ã·ã§ã³ã䜿ã£ãèšèªçæ (Chris Dyer)
- è¬çŸ©9: é³å£°èªè (Andrew Senior)
- è¬çŸ©10: é³å£°åæ (Andrew Senior)
- è¬çŸ©11: 質åå¿ç (Karl Moritz Hermann)
- è¬çŸ©12: èšæ¶ (Ed Grefenstette)
- è¬çŸ©13: ãã¥ãŒã©ã«ãããã«ãããèšèªç¥è (Chris Dyer)
è¬çŸ©1a: å°å ¥
è¬åž«ïŒPhil Blunsom (ãªãã¯ã¹ãã©ãŒã倧 / DeepMind)
-
ã¯ããã«
- AI (人工ç¥èœ) ãšããã°èªç¶èšèª
- èšèªã¯ãã³ãã¥ãã±ãŒã·ã§ã³ã ãã§ã¯ãªããæŠå¿µã衚çŸããããã«äœ¿ããã
- ã©ããã£ãŠäººéãèšèªãåŠç¿ãããã¯ãŸã ããåãã£ãŠããªã
-
æç§æž
- ææ°ã®ç ç©¶ãæ±ããããæç§æžã䜿ããªãããã ããå¿
èŠã«å¿ããŠä»¥äžãåèïŒ
- 深局åŠç¿: Goodfellow, Bengio, Courville: Deep Learning
- æ©æ¢°åŠç¿: Murphy æ¬ ãš Bishop æ¬
-
åæç¥è
- æ°åŠ (ç·åœ¢ä»£æ°, 埮åç©å, 確ç)
- æ©æ¢°åŠç¿ (è©äŸ¡; éåŠç¿, äžè¬å, æ£åå; ãã¥ãŒã©ã«ãããã¯ãŒã¯ã®åºç€)
- ããã°ã©ãã³ã° (èšèªããã¬ãŒã ã¯ãŒã¯ã¯åããªã)
- èªç¶èšèªåŠç (NLP) ãèšç®æ©èšèªåŠ (CL) ã®å
æ¬çãªã³ãŒã¹ã§ã¯ãªã
-
ã¿ã¹ã¯
- èšèªçè§£
- CNN ã®ãã¥ãŒã¹èšäºãäžãã質åã«çããã
- 倿 (transduction) ã¿ã¹ã¯ãç³»åããç³»åãžã®å€æã
- é³å£°èªèïŒé³å£° â ããã¹ã
- æ©æ¢°ç¿»èš³ïŒããã¹ã(èšèªX) â ããã¹ã(èšèªY)
- é³å£°åæïŒããã¹ã â é³å£°
- ç»åçè§£
- ããã®ç·æ§ã®èŠå㯠2.0 ãïŒã
- èŠåãšã¡ã¬ãã«é¢ããç¥èãæã£ãŠããªãããã°ãããªã
- èšèªçè§£
-
èšèªæ§é
- å€çŸ©æ§: "I saw her duck" (泚: "duck"ã¯ãã¢ãã«ããšããæå³ãšã身ãããããããšããæå³ããã)
- æ £çšå¥: "kick the bucket" âããæ»ã¬ã
- ç
§å¿: ããŒã«ã¯ç®±ã®äžã«å
¥ããªãã£ããããã¯[倧ãããã/å°ãããã]ããã ã
- ããããã¯ç®±ããããŒã«ãã
| è±èª | æ¥æ¬èª |
|---|---|
| compelling | 人ãåŒãã€ãã |
| prerequisite | åæç¥è |
| comprehensive | å æ¬ç㪠|
| 20/20 vision | ïŒæ¥æ¬ã§ã®ïŒèŠå 1.0 |
| duck (n.) | ã¢ãã« |
| duck (v.) | ãããã |
| kick the bucket | æ»ã¬ |
è¬çŸ©1b: 深局ãã¥ãŒã©ã«ãããã¯åã ã¡
è¬åž«ïŒWang Ling (DeepMind)
ããŒã1
- æ°å
- 倿°
- æŒç®å
-
颿° (å ¥å, åºå)
- 翻蚳, å²ç¢ã®æã®æšå®, ç»ååé¡ãããçš®ã®é¢æ°
-
颿°ãã©ãæšå®ããã
- ãã©ã¡ãŒã¿ã䜿ã£ãŠã¢ãã«åããå
¥åãšåºåãããã©ã¡ãŒã¿ãã©ãæšå®ãããïŒ
- ãã©ã¡ãŒã¿ã®ã仮説ããç«ãŠããã®ãã©ã¡ãŒã¿ã䜿ã£ãåºåãšå®éã®åºåãšãæ¯ã¹ãã
- 仮説ã®ãè¯ãã â æå€±é¢æ°ã«ãã£ãŠå®çŸ© (äŸ: äºä¹æå€±)
-
æé©å
- æå€±é¢æ°ãæå°åãã
- åçŽãªæé©åïŒæå€±é¢æ°ãäžããæ¹åã«ãçŸåšã®ãã©ã¡ãŒã¿ã 1 ãã€åãããŠãã
- æé©è§£ãèŠã€ãããããšã¯éããªãããæ€çŽ¢åé¡ããšåŒã°ããã
-
æé©åã®æ¹å
- ã¹ããããµã€ãºãå°ãããã (äŸ: 1 ã®ä»£ããã« 0.1 ãã€åãã)
- ã¹ããããµã€ãºãå°ããã€å°ããããŠãã
-
ãã®ã¢ãããŒããç»ååé¡ã«é©çšãããšïŒ
- ç»åå šäœãå ¥åã1ãã¯ã»ã« = 1倿°
- ã¢ãã«ã倧ãããã
- ãµã³ãã«æ°ãå€ããã
-
åŸé ïŒ
- ãã©ã¡ãŒã¿ãåãããã¯ãã«ãä»®å®ããç§»åããè·é¢ãããã®æå€±ã®æžå°å¹ ãä»®å® â limit â æå€±é¢æ°ã®åŸ®å!
- ã³ã¹ã颿°èªäœã®ä»£ããã«ã埮åãèšç®ããã°è¯ã
-
ææ¥éäžæ³
- çŸåšã®ãã©ã¡ãŒã¿ãããr = æå€±é¢æ°ã®åŸ®åïŒåäœè·é¢ãããã®æå€±æžå°ïŒãèšç®
- ãã©ã¡ãŒã¿ã r * a ã ãåãã (a ã¯åŠç¿ç)ïŒäžã®ã¹ã±ããåç
§ïŒ
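äžã®ææ¥éäžæ³ã®æŽæ°åïŒåŸé
r ã«åŠç¿ç a ãæããŠãã©ã¡ãŒã¿ãåããïŒãã1å€æ°ã®äºä¹æå€±ã§ç¢ºãããæå°éã®ã¹ã±ããã§ããèšç·ŽããŒã¿ã»åæå€ã»åŠç¿ç 0.1 ãªã©ã®æ°å€ã¯èª¬æçšã«çœ®ããä»®ã®å€ã§ãã

```python
import numpy as np

# äºä¹æå€± L(w) = (w * x - y)^2 ã® w ã«é¢ããåŸé
ã§æŽæ°ããïŒå€ã¯ãã¹ãŠä»®å®ïŒ
x, y = 2.0, 6.0   # èšç·ŽããŒã¿ 1 ç¹ïŒä»®ã®å€ïŒ
w = 0.0           # ãã©ã¡ãŒã¿ã®åæå€ïŒä»®èª¬ïŒ
a = 0.1           # åŠç¿çïŒã¹ããããµã€ãºïŒ

for step in range(50):
    loss = (w * x - y) ** 2
    r = 2 * (w * x - y) * x   # æå€±é¢æ°ã® w ã«é¢ãã埮åïŒåŸé
ïŒ
    w = w - a * r             # åŸé
ã®éæ¹å㞠r * a ã ãåãã
print(w)  # w 㯠y/x = 3.0 ã«è¿ã¥ã
```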
ããŒã2 - 深局åŠç¿å ¥é
-
çŸå®ã®ã¢ãã«ã¯éç·åœ¢
- äŸïŒïŒã€ã®ç·åœ¢ã¢ãã«ã®çµã¿åããããããããå€ (äŸ: x = 6) ããç·åœ¢ã¢ãã«ãåãæ¿ãã
- äžã€ã®ç·åœ¢ã¢ãã«ã§ã¯åŠç¿ãè¶³ããªã (underfitting)
-
ã¢ãã«ã®çµã¿åãã
- ã·ã°ã¢ã€ã颿°ãïŒã€ã®ç·åœ¢ã¢ãã«ãæ··åããéã¿ã«äœ¿ãïŒy = (w1 x + b1) s1 + (w2 x + b2) s2ïŒäžã®ã¹ã±ããåç
§ïŒ
- s = Ï(wx + b)
- wã倧ãããããšã»ãŒã¹ããã颿°ã«ãªã
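äžã® y = (w1 x + b1) s1 + (w2 x + b2) s2 ã®æåãæ°å€ã§ç¢ºãããå°ããªã¹ã±ããã§ããç·åœ¢ã¢ãã«ã®ä¿æ°ããåãæ¿ãç¹ã x = 6 ä»è¿ã«çœ®ãã²ãŒãã®ãã©ã¡ãŒã¿ã¯ãè¬çŸ©ã®äŸã«åããã䟿å®çã«çœ®ããä»®ã®å€ã§ãã

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# s ã¯ã©ã¡ãã®ç·åœ¢ã¢ãã«ã䜿ãããæ±ºãããœãããªã¹ã€ããïŒw ã倧ãããšã»ãŒã¹ããã颿°ïŒ
w_gate, b_gate = 10.0, -60.0          # x = 6 ä»è¿ã§åãæ¿ããä»®ã®å€
x = np.linspace(0, 12, 7)
s1 = sigmoid(-(w_gate * x + b_gate))  # x < 6 ã§ã»ãŒ 1
s2 = sigmoid(w_gate * x + b_gate)     # x > 6 ã§ã»ãŒ 1
y = (1.0 * x + 0.0) * s1 + (-1.0 * x + 12.0) * s2   # 2 ã€ã®ç·åœ¢ã¢ãã«ãæ··å
print(np.round(y, 2))  # x = 6 ãå¢ã«åŸããå転ãã屿 ¹åã®é¢æ°ã«ãªã
```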
-
å€å±€ããŒã»ãããã³
- äŸïŒïŒã€ã®ç·åœ¢ã¢ãã«ã®çµã¿åããã¯ïŒãnot s1 and not s3ããã©ã衚çŸããïŒ
- æåã®ã¬ã€ã€ãŒã¯å¢çæ¡ä»¶ãåŠç¿
- 次ã®ã¬ã€ã€ãŒã¯ç¯å² (è«çåãè«çç©) ãåŠç¿ãetc.
- ïŒå±€ã䜿ããš XOR ãåŠç¿ã§ãã â ååã®å±€ãšãã©ã¡ãŒã¿ãããã°ãä»»æã®é¢æ°ãè¿äŒŒã§ãã
-
æªåŠç¿ (underfitting)ïŒã¢ãã«ã®è¡šçŸåãïŒããŒã¿ã®è€éãã«å¯ŸããŠïŒäœããã
-
éåŠç¿ (overfitting)ïŒã¢ãã«ã®è¡šçŸåãé«ããã
- ããŒã¿éãå¢ãã â éåŠç¿ã®ãªã¹ã¯ãäžãã
- æ£åå â ã¢ãã«ã®è€éãã«å¯Ÿããããã«ã㣠â éåŠç¿ã®ãªã¹ã¯ãäžãã
-
颿£å€ã®æ±ãæ¹
- ã«ãã¯ã¢ããã»ããŒãã«ã䜿ã
- é£ç¶å€ã®éåã§ã颿£å€ãè¡šçŸ â åã蟌ã¿(embedding)
- åºå logit ã®éåã softmax ã§ç¢ºçååžã«å€æ
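äžã®ã颿£å€ã®æ±ãæ¹ãïŒå
¥åã¯åã蟌ã¿è¡åã®ã«ãã¯ã¢ãããåºå㯠logit ã softmax ã§ç¢ºçååžã«å€æïŒãæ°è¡ã§ç¢ºãããã¹ã±ããã§ããèªåœãµã€ãºãæ¬¡å
ãªã©ã®æ°å€ã¯èª¬æçšã®ä»®å®ã§ãã

```python
import numpy as np

vocab_size, dim = 5, 3
rng = np.random.default_rng(0)
E = rng.normal(size=(vocab_size, dim))   # åã蟌ã¿è¡åïŒåèª ID â ãã¯ãã«ã®ã«ãã¯ã¢ããããŒãã«ïŒ
W = rng.normal(size=(dim, vocab_size))   # åºåå±€ïŒãã¯ãã« â åèªåã® logitïŒ

word_id = 2
x = E[word_id]                 # one-hot ãã¯ãã«ãæãã代ããã«è¡ãåãåºãã®ãšåã
logits = x @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()           # softmax: logit ã®éåã確çååžã«å€æ
print(probs, probs.sum())      # åèš 1 ã®ç¢ºçååž
```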
è¬çŸ©2a: åèªã¬ãã«ã®æå³
è¬åž«: Ed Grefenstette (DeepMind)
åèªãã©ã衚çŸãããïŒ
- èªç¶èšèªã®ããã¹ãïŒé¢æ£çãªèšå·ã®ç³»å
- åçŽãªè¡šçŸïŒone-hot ãã¯ãã«
- åèªãã¯ãã«
åé¡ïŒãã¯ãã«ãã¹ããŒã¹ã§å·šå€§ãåèªå士ãçŽäº€ããŠãããæå³ãæ±ããªã
-
ååžé¡äŒŒåºŠ
- åèªã®æå³ãæèã«ãã£ãŠè¡šçŸãå¯ãªãã¯ãã«
- é »åºŠããŒã¹ãäºæž¬ããŒã¹ãã¿ã¹ã¯ããŒã¹ã®ïŒã€ã®æšå®ææ³
-
é »åºŠããŒã¹ã®ææ³
- æèïŒåèªã®å·Šå³ã® w åèªïŒã«ãããä»ã®åèªã®åºçŸé »åºŠãæ°ãã -> ãã¯ãã«å
- åèªå士ã®é¡äŒŒåºŠãã³ãµã€ã³é¡äŒŒåºŠã«ãã£ãŠèšç® ãã¯ãã«ã®é·ãã«äŸåããªã
- çŽ æ§ã®äžå¹³çã é¢é£ãããããé »åºŠãé«ãã®ãããã åã«ãã®åèªã®é »åºŠãé«ãã®ã æ§ã ãªæ£èŠåææ³(äŸïŒTF-IDF, PMI)
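äžã®é »åºŠããŒã¹ã®åèªãã¯ãã«ãšã³ãµã€ã³é¡äŒŒåºŠã®æå°äŸã§ããå°ããªã³ãŒãã¹ãšçªå¹
1 ã¯ä»®ã®èšå®ã§ãTF-IDF ã PMI ãªã©ã®æ£èŠåã¯çç¥ããŠããŸãã

```python
import numpy as np

corpus = [["I", "saw", "a", "cat"], ["I", "saw", "a", "dog"], ["a", "dog", "barked"]]
window = 1  # æèïŒå·Šå³ 1 åèªïŒã¯ä»®ã®èšå®
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
C = np.zeros((len(vocab), len(vocab)))   # å
±èµ·é »åºŠè¡å

for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                C[idx[w], idx[sent[j]]] += 1

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))  # ãã¯ãã«ã®é·ãã«äŸåããªã

print(cosine(C[idx["cat"]], C[idx["dog"]]))  # åãæèã«åºããããé¡äŒŒåºŠãé«ã
```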
-
ãã¥ãŒã©ã«åã蟌ã¿ã¢ãã«
- åèª-çŽ æ§ã®è¡åãèãã
- one-hot ãã¯ãã«ããã®è¡åã«æãåãããšåã蟌ã¿ãæ±ãŸã
-
ãã¥ãŒã©ã«åã蟌ã¿
- åèª t ã«å¯ŸããŠãã³ãŒãã¹å ã®æèã«ãããçŽ æ§ïŒä»ã®åèªïŒ c(t) ã®åºçŸãæ°ãã
- ã¹ã³ã¢é¢æ°ãå®çŸ©
- æå€±é¢æ°ããã¹ã³ã¢ã®ã³ãŒãã¹å šäœã®åãšå®çŸ©
- æå€±é¢æ°ãæå°å
- E ãåã蟌ã¿
-
è¯ãã¹ã³ã¢é¢æ°ãšã¯ïŒ
- åèªãåã蟌ã¿ã§è¡šçŸïŒåœç¶ïŒïŒ
- t ã c(t) ã«ãã£ãŠã©ã®ãããããŸã説æã§ãããã衚çŸã§ãã
- ããåèªããïŒãã®ä»ã®åèªãããïŒ æèãããŸã衚çŸã§ãã
- 埮åå¯èœ
-
C&W ã¢ãã« (Collobert et al. 2011)
- åã蟌㿠â ç³ã¿èŸŒã¿ â å€å±€ããŒã»ãããã³ â ã¹ã³ã¢
- æ£ããåèªã®ã¹ã³ã¢ãšãééã£ãïŒç¡äœçºã«éžãã åèªã§çœ®ãæããïŒã¹ã³ã¢ã®å·®ã«åºã¥ããã³ãžæå€±ãæå°åïŒãããŒãžã³ãæå€§åïŒããããã«åŠç¿
- çµæãšããŠãæèã®è¡šçŸã«é¢ããæ å ±ãåã蟌ã¿ã«ãšã³ã³ãŒãã£ã³ã°ãããããã«ãªã
-
CBoW (Mikolov et al. 2013)
- æèããåèªãäºæž¬
- æèã®åèªãåã蟌ã¿è¡šçŸãåãèšç®ããæåœ± â softmax
- ç·åœ¢ã§éã 以åã¯ãèšç®éã®é«ã softmax ã®ä»£ããã« negative sampling
- æè¿ã¯ softmax ããã®ãŸãŸèšç®
-
Skip-gram (Mikolov et al. 2013)
- åèªããæèãäºæž¬
- åèªã®åãèŸŒã¿ â æåœ± â softmax â æèåèªã®å°€åºŠ
- éãïŒ
- 深局åŠç¿ã§ã¯ãªãïŒ
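äžã® skip-gram ïŒ negative sampling ã®ã¹ã³ã¢èšç®ïŒäžå¿èªãšæèèªã®åã蟌ã¿ã®å
ç©ãã·ã°ã¢ã€ãã«éãïŒã ãã忿žãããæå°ã¹ã±ããã§ããåèª ID ãåã蟌ã¿ã®æ¬¡å
ã¯ä»®ã®ãã®ã§ãå®éã®åŠç¿ã«ã¯åŸé
æŽæ°ãšå€§èŠæš¡ã³ãŒãã¹ãå¿
èŠã§ãã

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

vocab_size, dim = 1000, 50
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(vocab_size, dim))   # äžå¿èªã®åã蟌ã¿
W_out = rng.normal(scale=0.1, size=(vocab_size, dim))  # æèèªïŒåºååŽïŒã®åã蟌ã¿

center, context, negatives = 3, 17, [42, 99, 512]       # ä»®ã®åèª ID
pos_score = sigmoid(W_in[center] @ W_out[context])       # æ£äŸïŒå®éã®æèèªïŒã¯é«ãããã
neg_score = sigmoid(-W_in[center] @ W_out[negatives].T)  # è² äŸïŒç¡äœçºã«éžãã åèªïŒã¯äœãããã
loss = -np.log(pos_score) - np.log(neg_score).sum()      # negative sampling ã®æå€±
print(loss)
```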
-
äž¡ææ³ã®æ¯èŒ
- é »åºŠããŒã¹ vs ãã¥ãŒã©ã«åã蟌ã¿ã¢ãã« â åãèããå ±æ
- word2vec == PMI (èªå·±çžäºæ
å ±é) è¡åã®åè§£ã«çžåœ (Levy and Goldberg, 2014)
-
ãã¥ãŒã©ã«ææ³ã®é·æ
- å®è£ ã»åŠç¿ã容æ
- é«ã䞊å床
- ä»ã®é¢æ£çãªæŠå¿µã䜿ãã (ä¿ãåããåè©ãªã©)
- ç»åãªã©ã®é£ç¶å€æèã«ã䜿ãã
-
å éšçãªè©äŸ¡
- WordSim-353, SimLex-999, ã¢ãããžãŒ (æåãªäŸïŒqueen = king - man + woman), å¯èŠå
-
å€éšçãªè©äŸ¡
- ä»ã®ã¿ã¹ã¯ã®æ§èœãäžããããã«äœ¿ã
-
ã¿ã¹ã¯ããŒã¹
- åã蟌ã¿ããã¥ãŒã©ã«ãããã®å ¥åãšããŠäœ¿ã
- åã蟌ã¿ïŒçŽ æ§ïŒããã¥ãŒã©ã«ãããã®ãã©ã¡ãŒã¿ã®äžéš
- çŽ æ§è¡šçŸã®åŠç¿
- åçŽãªäŸ
- æ/ææžåé¡ â Bag of Vectors ïŒåã蟌ã¿ãã¯ãã«ã®åïŒ â æåœ± â softmax
- ã¿ã¹ã¯äŸåã®è¡šçŸãåŠç¿ïŒææ æšå® â è¯å®çã»åŠå®çãªåèªïŒ
- åèªã®æå³ã¯ãã¿ã¹ã¯ã«æ ¹ä»ããŠããïŒã¿ã¹ã¯ãæå³ã決ããïŒ
-
äºèšèªçŽ æ§ (Hermann & Blunsom 2014)
- è±èªã®æ e_i ãšãã€ãèªã®æ g_i ã®é¡äŒŒåºŠãæå€§åããã
- åçŽãªåã®ãããã«ã飿¥ããåèªã®éã«éç·åœ¢æ§ãå°å ¥ (tanh)
- æå€±: å·®ãæå°å 0 ã«çž®éããªãããã«ã察蚳ãšé察蚳ãšã®éã®å·®ãæå€§å
- çŽæïŒå¯Ÿèš³æã¯ãé«ã¬ãã«ã®æå³ãå ±æãã
-
ãŸãšã
- ã¿ã¹ã¯ã«ç¹æã®æ
å ±ãåŠç¿ã§ããããã ãããããäžè¬çãªãæå³ããåŠç¿ããŠããä¿èšŒã¯ãªã
- ãã«ãã¿ã¹ã¯ç®ç颿°ã§ããçšåºŠè»œæžã§ããã
- äºååŠç¿ (pre-training) ããŠåºå®
- 転移åŠç¿ã®äžåœ¢æ ã¿ã¹ã¯åºæã®èšç·ŽããŒã¿ãå°ãªãã£ãããèªåœã®ã«ããŒçãå°ããã£ããããæã«æçš
- ã¿ã¹ã¯åºæã®èšç·ŽããŒã¿ã倧ãããšãã«ã¯ãäžè¬æ§ãç ç²ã«ããŠããåã蟌ã¿ãåŠç¿ããã»ããè¯ã
| è±èª | æ¥æ¬èª |
|---|---|
| literature | éå»ã®æç® |
| corrupt | (ããŒã¿ããããš) ç Žæããã |
| intrinsic | å éšç㪠|
| extrinsic | å€éšç㪠|
| salient | éèŠãª |
è¬çŸ©2b: å®ç¿ã®æŠèŠ
è¬åž«ïŒChris Dyer (DeepMind/CMU)
- ïŒçš®é¡ã®å®ç¿
- å ¥éå®ç¿
- æ¬å®ç¿
å®ç¿1
-
èšèªã®ãæç¥ããšã衚çŸã
- æªç¥èª
- ããŒã¯ãã€ãº ("New York City" ã¯ïŒããŒã¯ã³ãïŒïŒããŒã¯ã³ã)
- 倧æåã»å°æå
-
ã³ãŒãã¹
- ãã¥ãŒã¹èšäº vs twitter
- Heaps' Law: ã³ãŒãã¹ã®ãµã€ãºãå¢ããã«ãããã£ãŠèªåœã®ãµã€ãºãå¢ããã
- twitter 㯠α ãå°ããïŒæ¥æ¿ã«ïŒå¢ããïŒã·ã³ã°ã«ãã³ïŒäžåºŠããåºçŸããªãåèªïŒã 70%
å®ç¿1ã§ã¯ã衚çŸåŠç¿ãæ±ã
æ¬å®ç¿
-
ããã¹ãåé¡ (e.g., ã¹ãã ãã£ã«ã¿)
-
èªç¶èšèªçæ (NLG)
- èšèªã¢ããªã³ã°
- ã¿ã€ãä¿®æ£
- æ¡ä»¶ä»ãèšèªã¢ããªã³ã°
-
èªç¶èšèªçè§£
- 翻蚳ãèŠçŽããã£ããããã (+NLG)
- æç€ºçè§£
- 質åå¿ç
- 察話ã€ã³ã¿ãŒãã§ãŒã¹
-
è§£æ
- ãããã¯ã¢ããªã³ã°
- èšèªè§£æïŒäŸïŒ 圢æ çŽ è§£æãæ§æè§£æïŒ
-
ããŒã¿ã»ãã
- åäžã®ããŒã¿ã»ãã TED Talks ã䜿ã
- åäžã®ããŒã¿ã»ãããè²ã ãªåé¡ã«å€æããã¹ãã« â æ©æ¢°åŠç¿ã§ã¯éèŠïŒ
- ãããã¯ã©ãã«ãã¿ã€ãã«ãèŠçŽããããªããããªãšã®ã¢ã©ã€ã³ã¡ã³ãã翻蚳
-
å ¥éå®ç¿ -> TED Talks ããåèªåã蟌ã¿ãåŠç¿
-
æ¬å®ç¿
- TED ã®ã©ãã«ãäºæž¬
- TED ã®ã©ãã«ããããŒã¯ãçæ
- TED ã®ããŒã¯ç¿»èš³åš
- TED ã®ããŒã¯ããèŠçŽãçæ
- ããéšåã話ãã®ã«ãããæéãæšå®
- èŽè¡ãç¬ã£ããã©ãããäºæž¬
-
ããŒã«ããã
- èªå埮å â æåã®åŸ®åã¯ééãããããé床ãéèŠ
- éç (TensorFlow, Theano)ïŒé·æ: èšç®ã°ã©ããæé©åã§ãããçæ: æŒç®ãéãããŠãã
- åç (DyNet, PyTorch)ïŒé·æ: äœã§ãæžãããçæ: æé©åããã«ãã
| è±èª | æ¥æ¬èª |
|---|---|
| practical | å®ç¿ |
| percept | ç¥èŠ |
| derivative | 埮å |
è¬çŸ©3: èšèªã¢ãã«ãšRNN ããŒã1
è¬åž«ïŒPhil Blunsom
-
èšèªã¢ãã«ãšã¯
- åèªã®ç³»åã«ç¢ºçãäžãã
- æ ¹æ¬çãªåé¡
- 翻蚳 â æ§æãèªé ã®è§£æ¶
- é³å£°èªè â åèªãã§ã€ã¹ã®ææ§æ§è§£æ¶
-
æŽå²
- æŠæã®æå·çè«ïŒãã€ãã® Enigma æå·ã®è§£èªïŒâ æå·æãããšã®çºè©±ã埩å
ãããšããçºæ³
- å€ãã®èªç¶èšèªåŠçã¿ã¹ã¯ã¯ã(æ¡ä»¶ä»ã)èšèªã¢ããªã³ã°ã«åž°çã§ãã
- äŸ: 翻蚳ã質åå¿çã察話
-
èšèªã¢ãã«ã®åºæ¬
- é£éåŸ (Chain rule) ã䜿ã£ãŠãåæååžãæ¡ä»¶ä»ãååžã®ç©ã«åè§£
- â éå»ã®å±¥æŽãããæ¬¡ã®åèªãäºæž¬ããåé¡ã«å€æ
- 倧éã®ããŒã¿ãç°¡åã«ååŸã§ãã
- èªç¶èšèªãçè§£ããããšåæ§ã«é£ãã
- äŸïŒP(w | There she built a) â éåžžã«å€ãã®å¯èœæ§ããããé£ãã
- äŸ: P(w | Alice went to the beach. There she built a) â "sand castle" "boat" etc.
- there â the beach, she â Alice ã®ç §å¿é¢ä¿ãçè§£ããã€ãã»ãã³ãã£ã¯ã¹ãçè§£
-
è©äŸ¡
- ã¯ãã¹ãšã³ããããŒ
- ããã¹ããèšèªã¢ãã«ã§ãšã³ã³ãŒãããæã«å¿ èŠãªãããæ°
- Perplexity
- 2 ã®ã¯ãã¹ãšã³ããããŒä¹
- ååèªãèŠããšãã®ã¢ãã«ã®ãé©ã床åãã
- æç³»åäºæž¬åé¡
- èšç·ŽããŒã¿ãšã¯å¥ã®ãã¹ãããŒã¿ã䜿ã
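äžã®ã¯ãã¹ãšã³ããããŒãšããŒãã¬ãã·ãã£ã®é¢ä¿ïŒPPL = 2^HãH 㯠1 åèªãããã®å¹³åãããæ°ïŒãå°ããªäŸã§ç¢ºèªããã¹ã±ããã§ãã確çå€ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

# ã¢ãã«ããã¹ãããŒã¿ã®ååèªã«äžãã確çïŒä»®ã®å€ïŒ
word_probs = np.array([0.2, 0.1, 0.05, 0.4, 0.25])

cross_entropy = -np.mean(np.log2(word_probs))  # 1 åèªããããšã³ã³ãŒãã«å¿
èŠãªãããæ°
perplexity = 2 ** cross_entropy                # ååèªãèŠããšãã®ãé©ã床åãã
print(cross_entropy, perplexity)
```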
-
ããŒã¿
- å°ãªããšã 10ååèªã¯å¿ èŠ
- PTB (Penn Treebank)
- å°ãã
- å å·¥ããéã
- Billion Word Corpus
- æãã©ã³ãã ã«å ¥ãæ¿ããŠèšç·Žã»ãã¹ãã»ãããäœæ
- ãæªæ¥ãšéå»ãåé¢ããèšäºãåé¢ãã®ïŒã€ã®ååã«åããŠãã
- äžã®ïŒã€ãšãæ¬ é¥ãããã®ã§æ¬åœã¯äœ¿ãã¹ãã§ã¯ãªã
- WikiText datasets
- ãªã¹ã¹ã¡
-
Nã°ã©ã ã¢ãã«
- ãã«ã³ãä»®å®
-
æåŸã® k - 1 åã®åèªã ããèŠãŠæ¬¡ãäºæž¬ïŒk ã°ã©ã ã¢ãã« = (k - 1) 次ãã«ã³ãã¢ãã«
- å€é ååžãæ±ããã®ãç°¡åãã¹ã±ãŒã©ãã«
- äŸïŒãã©ã€ã°ã©ã P(w3 | w1, w2) = count(w1, w2, w3) / count(w1, w2)
-
ããã¯ãªã
- æå°€æšå®ãè¯ããšã¯éããªã
- "Oxford Primm's eater" â ã³ãŒãã¹äžã«ããããäžåºŠãçŸããªã
- ãã€ã°ã©ã 確çãšè£éãã
- åçŽãªææ³ã®äžã€ïŒç·åœ¢è£éããã¯ãªã
- è¶ å·šå€§ããŒã¿ãããã°ãå²ãšåçŽãªææ³ã§ãããŸããã
- æãäžè¬çãªææ³ Kneser-Ney
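äžã®æå°€æšå®ïŒã«ãŠã³ãã®æ¯ïŒãšç·åœ¢è£éã®æå°äŸã§ããã³ãŒãã¹ãšè£éä¿æ°ã¯ä»®ã®å€ã§ãæ¬æ¥ã¯è£éä¿æ°ã¯éçºããŒã¿ã§èª¿æŽããŸããKneser-Ney ãªã©ã®ææ³ã¯çç¥ããŠããŸãã

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()  # ä»®ã®å°ããªã³ãŒãã¹
uni = Counter(corpus)
bi = Counter(zip(corpus, corpus[1:]))
tri = Counter(zip(corpus, corpus[1:], corpus[2:]))
N = len(corpus)

def p_mle(w1, w2, w3):
    # P(w3 | w1, w2) = count(w1, w2, w3) / count(w1, w2)ïŒæªèŠ³æž¬ã ãš 0 ã«ãªã£ãŠããŸãïŒ
    return tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0

def p_interp(w1, w2, w3, lambdas=(0.5, 0.3, 0.2)):
    # ãã©ã€ã°ã©ã ã»ãã€ã°ã©ã ã»ãŠãã°ã©ã ã®ç·åœ¢è£éïŒlambdas ã¯ä»®ã®å€ïŒ
    p3 = p_mle(w1, w2, w3)
    p2 = bi[(w2, w3)] / uni[w2] if uni[w2] else 0.0
    p1 = uni[w3] / N
    return lambdas[0] * p3 + lambdas[1] * p2 + lambdas[2] * p1

print(p_mle("the", "cat", "sat"), p_interp("the", "cat", "ate"))
```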
-
èšèªã¢ãã«ãé£ããçç±
- ãã³ã°ããŒã«
- ã©ããªã«ã³ãŒãã¹ã倧ããããŠãæªç¥èªããã
- ã«ãŒã«ããŒã¹ã®äººå·¥ç¥èœãããŸããããªãçç±ã§ããã
- é·æ
- éã
- è©äŸ¡ã宿°æé
- èšèªã®å®éã®ååžã«ããã
- çæ
- é·è·é¢ã®äŸåé¢ä¿ãæ±ããªã
- dog/cat ãªã©ã®æå³ã®äŒŒãååžã圢æ è«ãæ±ããªã
-
ãã¥ãŒã©ã«Nã°ã©ã ã¢ãã«
-
ãã©ã€ã°ã©ã ã®å Žå
- wn-2, wn-1 (one-hot ãã¯ãã«)
- çŽåïŒåèªãå ¥åãšããŠãwn ã®ååžã softmax ã§åºåããååããã¥ãŒã©ã«ããã
- åºåã® softmax å±€ã¯å·šå€§ïŒïŒå ¥å㯠one-hot vector ãªã®ã§ããã»ã©ã§ããªãïŒ
-
ãµã³ããªã³ã°
- çŽåïŒåèªãå ¥åãåºåå±€ã®ç¢ºçãããšã«æ¬¡ã®åèªããµã³ãã«ã...
- ãã³ãŒãã£ã³ã°(埩å·å)ã®åºæ¬
- 確çãããšã«æå€§ã®ãã®ãéžã¶
-
åŠç¿
- å®éã®åèªã«ã¢ãã«ãäžãã確çãããæå€±é¢æ°ïŒè² ã®å¯Ÿæ°å°€åºŠïŒãå®çŸ©
- 誀差éäŒæ
- ã¿ã€ã ã¹ãããããšã«å±é (unrolling)
- éäŒæã¯æšæ§é ã«ãªã -> è€æ°CPU/GPU ãã¯ã©ã¹ã¿äžã§åæ£å¯èœ
-
é·æ
- æªç¥Nã°ã©ã ãžã®äžè¬å
- åã蟌ã¿ã«ãããé¡çŸ©èªãããŸãæ±ãã
- Nã°ã©ã ã¢ãã«ãããã¡ã¢ãªãŒäœ¿çšéãå°ãªãïŒç·åœ¢çŽ æ§ã®å ŽåïŒ
- çæ
- nã®å€ã«åŸããã©ã¡ãŒã¿æ°ãå¢å
- é·è·é¢ã®äŸåãæ±ããªã
- èšèªã®å®éã®ååžãä¿èšŒããªã
-
-
RNN ååž°åãã¥ãŒã©ã«ããã
- ç¡éã®å±¥æŽ
-
çŽåã®é ãå±€ -> 次ã®é ãå±€ãžã®ãªã³ã¯
-
éäŒæ
- å±éãããš DAG(æåéå·¡åã°ã©ã) ã«ãªã
- éåžžéã誀差éäŒæã§ãã (BPTT; Back Propagation Through Time) æé軞äžã®éäŒæ
- ãã ããååèªãé ãå±€ã¯ç¬ç«ã«èšç®ã§ããªã
-
Truncated Back Propagation Through Time
- éäŒæãéäžã§æã¡åã (ååãäŒæã¯æã¡åããªã)
-
ãããããå
- BPTT ã䜿ããšãã·ãŒã¯ãšã³ã¹ã®é·ãããã©ãã€ãïŒãããã¯æé·ç³»åã®é·ãã«åãããŠããã£ã³ã°ããå¿
èŠãããïŒ
- TBPTT ã䜿ããšãåã·ãŒã¯ãšã³ã¹ãäžå®ã®é·ãã«åãŸã (GPUã§é«éåãããã)
-
é·æ
- é·è·é¢ã®äŸåãæ±ãã (翻蚳çãããããã«ã¯å¿ é )
- å±¥æŽãé ãå±€ã«å§çž®, äŸåã®é·ãã«å¿ããŠãã©ã¡ãŒã¿æ°ãå¢ããªã
- çæ
- åŠç¿ãé£ãã
- é ãå±€ã®ãµã€ãºã®2ä¹ã«åŸã£ãŠã¡ã¢ãªãå¢ãã
- èšèªã®å®éã®ååžãä¿èšŒããªã
-
ãã€ã¢ã¹ã»ããªã¢ã³ã¹ã®ãã¬ãŒããªã
- äŸïŒæãèŠããŠæ°ããã ãã®èšèªã¢ãã« â äœãã€ã¢ã¹ãé«ããªã¢ã³ã¹
- Nã°ã©ã ã¢ãã«ïŒãã€ã¢ã¹æããäœããªã¢ã³ã¹
- RNN: ãã€ã¢ã¹ãæžãã
| è±èª | æ¥æ¬èª |
|---|---|
| utterance | çºè©± |
| inflate | (æ°åã)宿 以äžã«å€§ãããã |
| flawed | æ¬ é¥ããã |
| power law | ã¹ãä¹å |
| amenable | ãã«é©ããŠãã |
| feed | (ãã¥ãŒã©ã«ãããã«ãå®ããŒã¿ã)äžãã |
| unroll | (RNNã)å±éãã |
| truncate | (äœåãªãã®ã)åãåã |
| esoteric | é£è§£ãª |
| strawman | ãããå° |
è¬çŸ©4: èšèªã¢ãã«ãšRNN ããŒã2
è¬åž«ïŒPhil Blunsom
RNN ã Nã°ã©ã èšèªã¢ãã«ãããæ§èœãè¯ãã®ã ãšããããäœãé·è·é¢ã®äŸåé¢ä¿ãæããããŠããã¯ãã â ããããæ¬åœã«åŠç¿ã§ããŠãããïŒ
-
æ¶ããïŒççºããïŒåŸé åé¡
- æé軞äžãããã®ãŒãåŸé ã«äœãèµ·ããŠãããïŒ
- z_n = éç·åœ¢é¢æ°ãžã®å
¥å (z_n = V_h h_{n-1} + ...)
- z_n ãçŽåã®é ãå±€ h_{n-1} ã§å埮åãããš V_h ãçŸãã
- 誀差ã h_1 ã§å埮åãããšãV_h ãäœåºŠãæãåãããããšã«ãªã
- V_h ã®ã¹ãã¯ãã«ååŸïŒæå€§åºæå€ã®çµ¶å¯Ÿå€ïŒã¯å€ãã®å Žå 1 ããå°ãã â è·é¢ã«ãããã£ãŠåŸé
ãææ°é¢æ°çã«å°ãããªãïŒäžã®ã¹ã±ããåç
§ïŒ
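ã¹ãã¯ãã«ååŸã 1 ããå°ããè¡åãç¹°ãè¿ãæãããšåŸé
ã®ãã«ã ãææ°çã«çž®ãããšãæ°å€ã§ç¢ºãããã ãã®ã¹ã±ããã§ããéç·åœ¢é¢æ°ã®åŸ®åã¯çç¥ããè¡åã®å€ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

rng = np.random.default_rng(0)
V = 0.5 * rng.normal(size=(8, 8)) / np.sqrt(8)   # ã¹ãã¯ãã«ååŸã 1 ããå°ãããªãããã«ããä»®ã®ãªã«ã¬ã³ãè¡å
grad = rng.normal(size=8)                        # æçµæå»ã§ã®åŸé
ïŒä»®ïŒ

print("spectral radius:", np.max(np.abs(np.linalg.eigvals(V))))
for t in range(0, 60, 10):
    # t ã¹ãããåãžäŒããåŸé
ã®å€§ããïŒè·é¢ãšãšãã«ææ°çã«å°ãããªãïŒ
    print(t, np.linalg.norm(np.linalg.matrix_power(V.T, t) @ grad))
```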
-
解決ç
- ïŒé埮å䜿ã (äŸïŒLBFGS) (ã¹ã±ãŒã«ããªã)
- åŸé ãæ¶ããªããããªåæåããã
- æ ¹æ¬çãªè§£æ±ºçïŒã¢ãŒããã¯ãã£ãå€ããŠããŸãããšïŒ
-
LSTM (Long Short Term Memory)
- ã»ã«ç¶æ cn (èšæ¶) ãå°å ¥
- LSTM ã®ããŒïŒçŸåšã®ã»ã«ç¶æ
= f *ïŒçŽåã®ã»ã«ç¶æ
ïŒ+ïŒäœãïŒ
- 泚ïŒRNN ã®ãããªæãç®ã§ã¯ãªããè¶³ãç®
- ããã«éç·åœ¢æ§ãå
¥ããªãã®ãéèŠ
- äœã -> i (å ¥åã²ãŒã) * tanh(å ¥å; çŽåã®é ãå±€)
- f -> å¿åŽ (forget)
- å®è£ ïŒå€ãã®ç·åœ¢å€æãã²ãšãŸãšãã«ã§ãã
- å€çš®ïŒi ã®ãããã« (1 -f)
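äžã® LSTM ã»ã«ã® 1 ã¹ãããåïŒå¿åŽã²ãŒã f ã§æ§ã»ã«ãæžè¡°ãããå
¥åã²ãŒã i ã§æ°ããåè£ãå ç®ïŒã®æå°ã¹ã±ããã§ãã4 ã€ã®ç·åœ¢å€æã 1 ã€ã®è¡åã«ãŸãšããŠããç¹ããå€ãã®ç·åœ¢å€æãã²ãšãŸãšãã«ã§ãããã«å¯Ÿå¿ããŸããå€æ°åããµã€ãºã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

dim_x, dim_h = 4, 8
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * dim_h, dim_x + dim_h))  # i, f, o, åè£ã®ç·åœ¢å€æããŸãšãã 1 ã€ã®è¡å
b = np.zeros(4 * dim_h)

def lstm_step(x, h_prev, c_prev):
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c = f * c_prev + i * np.tanh(g)   # æãç®ã§ã¯ãªãè¶³ãç®ã§èšæ¶ãæŽæ°ïŒããã«éç·åœ¢ã¯å
¥ããªãïŒ
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(dim_h), np.zeros(dim_h)
for x in rng.normal(size=(5, dim_x)):   # é·ã 5 ã®ä»®ã®å
¥åç³»å
    h, c = lstm_step(x, h, c)
print(h.shape, c.shape)
```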
-
Gated Recurrent Unit (GRU)
- ã»ã«ç¶æ ãç¡ããh ã ã
- "ã²ãŒãä»ãåã»ã« (Gated additive cells)" 㯠(åæåãªã©ã工倫ããªããŠã)ããŸããã
- æ©æ¢°ç¿»èš³ãé³å£°èªèãé³å£°åæã¯ã ããã LSTM
-
æ·±ã RNN ã«åºã¥ãèšèªã¢ãã«
- 瞊æ¹åã«é·ããã -> æå» t ã§è€æ°ã®é ãå±€ (èšæ¶ãå¢ãã)
- 暪æ¹åã«é·ããã -> Recurrent Highway Network
-
ã¹ã±ãŒãªã³ã°
- ãã¥ãŒã©ã«èšèªã¢ãã«ã®èšç®é -> èªåœã®ãµã€ãºã«å€§ããªåœ±é¿ãåãã
- ç¹ã«æåŸã® softmax å±€
- short-list â é«é »åºŠèªã ããã¥ãŒã©ã«ã¢ãã«ã䜿ããä»ã¯n-gram -> ãã¥ãŒã©ã«ã¢ãã«ã®å©ç¹ã殺ããŠããŸã
- Batch local short-list -> ãããå ã®èªåœã ãã䜿ããä¹±æŽãªè¿äŒŒïŒäžå®å®
- åŸé ãè¿äŒŒ -> softmax ã exp ã§çœ®ãæãã忝ãéããã©ã¡ãŒã¿ã§çœ®ãæãã
- NCE (Noise Contrastive Estimation)
- ããŒã¿ãæ¬åœã®ååžããæ¥ãŠããããã€ãºã®ããååžããæ¥ãŠãããã®äºå€åé¡åš
- èšç·Žæéãåæž
- ãã¹ãæã«ã¯éåžžã® softmax ãèšç®ããã®ã§ãéããªããªã
- Importance Sampling (IS)
- æ¬åœã®åèªãšãã€ãºã®ãããµã³ãã«ã®éã®å€å€åé¡åš
- èŠçŽ ã«åè§£ (factorization)
- Brown ã¯ã©ã¹ã¿ãªã³ã°çã䜿ã£ãŠã¯ã©ã¹ã«åé¡
- ã¯ã©ã¹ã®ååžãšã¯ã©ã¹å ã®ååžã®ïŒã€ã® softmax ã«åè§£
- éå±€çãªäºåæšïŒãã€ããªã³ãŒããååèªã«ä»äžïŒ
- çæïŒäºåæšãäœãã®ãé£ãã GPU ã§éããã«ãã
-
ãµãã¯ãŒãã¢ãã«
- åèªã®ãããã«æåã¬ãã«
- äžåœèªãªã©ã¯åèªã®æŠå¿µãããããææ§
- softmax ãéããæªç¥èªãç¡ã (äŸïŒäººå)ããã ãäŸåé¢ä¿ã®è·é¢ã¯é·ããªã
- 圢æ åŠçãªåèªå ã®æ§é ãæ±ãã äŸïŒdisunited, disinherited, disinterested
- perplexity ã§ã¯åèªã¬ãã«ã®ã¢ãã«ã«ãŸã åã°ãªãããèšèªã¢ãã«ã®æªæ¥ïŒã®æ¹åæ§ïŒ
-
æ£èŠå
-
Dropout
- 0/1 ã®ããããã¹ã¯ããµã³ãã«ãé ããŠãããã«ä¹ç®
- ãªã«ã¬ã³ããªæ¥ç¶ (äŸïŒé ãå±€é) ã«é©çšããŠã广çã§ã¯ãªã
- çç±: æé軞äžã§ãããç¹°ãè¿ããšãããæéãçµéãããšå šãŠã®é ãç¶æ ããã¹ã¯ãããŠããŸã
- ãªã«ã¬ã³ãã§ã¯ãªãæ¥ç¶ïŒäŸïŒå ¥åâé ãå±€) ã ãã« Dropout ãé©çš
- ããæ¡ãããæ¹æ³ïŒéåŠç¿ããã»ã©ãããã¯ãŒã¯ã倧ããããåŒ·ãæ£èŠåãããã
-
Bayesian Dropout
- ãªã«ã¬ã³ããªæ¥ç¶éã§å ±æããã Dropout ãã¹ã¯ã䜿ã
- æããšã«éãéã¿ã䜿ã
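ãªã«ã¬ã³ãæ¥ç¶ã«å¯Ÿãã Dropout ã®éãïŒæå»ããšã«ãã¹ã¯ãåŒãçŽããšé ãç¶æ
ãæéãšãšãã«æ¶ããŠããã®ã«å¯ŸããBayesian (variational) Dropout ã§ã¯ç³»åå
ã§åããã¹ã¯ãå
±æããïŒã瀺ãæŠå¿µçãªã¹ã±ããã§ãããªã«ã¬ã³ãè¡åã®ä¹ç®ãªã©ã¯çç¥ããæ°å€ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

rng = np.random.default_rng(0)
T, dim, keep = 20, 16, 0.8
h = np.ones(dim)

# (1) æå»ããšã«éããã¹ã¯ïŒé ãç¶æ
ã®æ¬¡å
ãæå»ãšãšãã«ã©ãã©ãæ¶ããŠãã
h_naive = h.copy()
for t in range(T):
    h_naive *= rng.binomial(1, keep, size=dim) / keep

# (2) ç³»åå
ã§å
±æãããã¹ã¯ïŒæ¶ãã次å
ã¯åžžã«åãïŒæããšã«å¥ã®ãã¹ã¯ããµã³ãã«ïŒ
shared_mask = rng.binomial(1, keep, size=dim) / keep
h_shared = h * shared_mask

print((h_naive > 0).sum(), (h_shared > 0).sum())  # (1) ã¯çãæ®ã次å
ãææ°çã«æžã
```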
-
| è±èª | æ¥æ¬èª |
|---|---|
| compelling | é åç㪠|
| Finnish | ãã£ã³ã©ã³ãèª (圢æ è«ãç¹ã«è€éãªããšã§æå) |
| Turkish | ãã«ã³èª |
| hone in on | ..ã«çŠç¹ããããã |
| modus operandi | 決ãŸã£ãããæ¹ |
è¬çŸ©5: ããã¹ãåé¡
è¬åž«ïŒKarl Moritz Hermann (DeepMind)
-
ããã¹ãåé¡ãšã¯ïŒ
- ã¹ãã åé¡
- èšäºã®ãããã¯
- ãã€ãŒãã®ããã·ã¥ã¿ã°äºæž¬
-
åé¡ã®çš®é¡
- äºå€ (true/false)
- å€å€
- å€ã©ãã«
- ã¯ã©ã¹ã¿ãªã³ã°
-
åé¡ã®æ¹æ³
- æåïŒæ£ç¢ºã ãé ããé«ãïŒ
- ã«ãŒã«ããŒã¹ïŒæ£ç¢ºã ããã«ãŒã«ã人æã§æžãå¿ èŠãããïŒ
- çµ±èšããŒã¹ïŒèªåã§é«éã ããèšç·ŽããŒã¿ãå¿ èŠïŒ
-
çµ±èšçããã¹ãåé¡
- P(c|d) (c ... ã¯ã©ã¹ãd ... ããã¹ã/ææž)
- è¡šçŸ ããã¹ã -> d
- åé¡ P(c|d)
-
衚çŸ
- BoW (bag-of-words)
- æåã§äœã£ãçŽ æ§
- çŽ æ§åŠç¿
-
çæ vs èå¥ã¢ãã«
- çæã¢ãã« P(c, d) æœåšå€æ°ãšèŠ³æž¬å€æ°ã®åæååžã«ç¢ºçãä»äž
- Nã°ã©ã , é ããã«ã³ãã¢ãã«, 確ççæèèªç±ææ³, etc.
- èå¥ã¢ãã« P(c | d) ããŒã¿ãäžããããæã®æœåšå€æ°ã®ååžã«ç¢ºçãä»äž
- ããžã¹ãã£ãã¯ååž°
-
ãã€ãŒããã€ãº
- ãã€ãºã®æ³å
- P(c | d) ∝ P(c) P(d|c) -> ææžãåèªã«åè§£ããP(d|c) = ∏ P(t_i|c)
- ã©ãã«ä»ãã®èšç·ŽããŒã¿ã®çµ±èšãåãã ã
- ããã€ãŒãã-> å šãŠã®åèªã¯ç¬ç«ãææžã®ç¢ºçãåèªã®ç¢ºçã®ç©ã§è¿äŒŒãå®éã¯ãã£ããããŸããã
- MAP (maximum a posteriori)
- 倧éã®å°ãã確çã®ç©ã¯ã¢ã³ããŒãããŒãã â 察æ°ç©ºéã§èšç®
- 確çãŒããâ ã¹ã ãŒãžã³ã°
- é·æïŒã·ã³ãã«ãè§£éå¯èœãéã
- çæïŒç¬ç«ä»®å®ãæã»ææžã®æ§é ãèæ ®ããŠãªãããŒã確ç
- ãã€ãŒããã€ãºã¯çæã¢ãã«ïŒ
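äžã®ãã€ãŒããã€ãºåé¡ã察æ°ç©ºéã§è¡ãæå°ã¹ã±ããïŒã¢ã³ããŒãããŒåé¿ïŒã§ããã¹ã ãŒãžã³ã°ã¯å ç® 1 ã®ã©ãã©ã¹å¹³æ»åãšããã«ãŠã³ããèªåœãäºåç¢ºçã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

vocab = ["free", "money", "meeting", "today"]
# ã¯ã©ã¹ããšã®åèªåºçŸåæ°ïŒä»®ã®èšç·ŽããŒã¿ã®çµ±èšïŒ
counts = {"spam": np.array([30, 25, 1, 4]), "ham": np.array([2, 3, 20, 15])}
prior = {"spam": 0.4, "ham": 0.6}

def log_posterior(doc, c):
    word_counts = counts[c] + 1                       # ã©ãã©ã¹å¹³æ»åïŒç¢ºç 0 ãé¿ããïŒ
    log_p_word = np.log(word_counts / word_counts.sum())
    score = np.log(prior[c])
    for w in doc:
        score += log_p_word[vocab.index(w)]           # 察æ°ã®åïŒç¢ºçã®ç©ã®ä»£ããïŒ
    return score

doc = ["free", "money", "today"]
print(max(("spam", "ham"), key=lambda c: log_posterior(doc, c)))  # MAP 掚å®
```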
-
çŽ æ§è¡šçŸ
- äºå€ã»å€å€ã»é£ç¶å€
-
ããžã¹ãã£ãã¯ååž°
- ããžã¹ãã£ãã¯ïŒããžã¹ãã£ãã¯é¢æ°ã䜿ããååž°ïŒçŽ æ§ãšéã¿ã®ç©ã§è¡šçŸ
- äºå€ã®ã±ãŒã¹ logit = ãã€ã¢ã¹ïŒéã¿*çŽ æ§ãP(true|d) = ããžã¹ãã£ãã¯é¢æ°(logit)
- å€å€ã®ã±ãŒã¹ logit â softmax
- softmax â ããžã¹ãã£ãã¯é¢æ°ã®å€å€ãžã®æ¡åŒµ
- åé¡ã ãã§ã¯ãªã確çãåŠç¿
- åŠç¿ïŒå¯Ÿæ°å°€åºŠãæå€§åãβ ã«é¢ããŠåžãªã®ã§åŸé
æ³ã§æé©åã§ããããã ãéãã圢ã®è§£ã¯ååšããªã
- é·æïŒã·ã³ãã«ãè§£éå¯èœãçŽ æ§éã®ç¬ç«ãä»®å®ããªããçæïŒåŠç¿ãïŒãã€ãŒããã€ãºããïŒé£ãããææ³ããã¶ã€ã³ããå¿ èŠãæ±åããªãå¯èœæ§
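äžã®ããžã¹ãã£ãã¯ååž°ïŒå€å€ã®å Žå㯠softmaxïŒã®äºæž¬ãšãåŸé
ã«ããåŠç¿ã® 1 ã¹ããããæžããã¹ã±ããã§ããçŽ æ§ãã¯ãã«ãæ£è§£ã©ãã«ãåŠç¿çãªã©ã®æ°å€ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

num_feats, num_classes = 6, 3
rng = np.random.default_rng(0)
W = np.zeros((num_classes, num_feats))
b = np.zeros(num_classes)

x = rng.normal(size=num_feats)   # ææž 1 件ã®çŽ æ§ãã¯ãã«ïŒä»®ïŒ
y = 2                            # æ£è§£ã¯ã©ã¹ïŒä»®ïŒ

probs = softmax(W @ x + b)                    # logit â softmax ã§ç¢ºç
loss = -np.log(probs[y])                      # ã¯ãã¹ãšã³ããããŒïŒå¯Ÿæ°å°€åºŠã®æå€§åãšåå€ïŒ
grad_logits = probs.copy()
grad_logits[y] -= 1
W -= 0.1 * np.outer(grad_logits, x)           # åžãªã®ã§åŸé
æ³ã§æé©åã§ãã
b -= 0.1 * grad_logits
print(loss, softmax(W @ x + b)[y])            # æ£è§£ã¯ã©ã¹ã®ç¢ºçãäžãã
```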
-
RNN (ååž°åãã¥ãŒã©ã«ãããã¯ãŒã¯)
- hi ã¯ãi ãŸã§ã®å ¥åãšãi - 1 ãŸã§ã® h ã«äŸå
- i ãŸã§ã®ããã¹ãã®æ å ±ãå«ãã§ãã
- ããã¹ãã®è¡šçŸãã®ãã®ïŒ
- ããã¹ãå šäœãèªãŸããåŸã® h ãåãåºããŠçŽ æ§ã«ããã°ãã
- h ãå¿ èŠãªæ å ±ãå«ãã§ããããšãã©ã®ããã«ä¿èšŒãããïŒ
- æå€±é¢æ°ïŒåºæ¬çã«ã¯ MLP (å€å±€ããŒã»ãããã³) ãšåã â ã¯ãã¹ãšã³ããããŒ
- å€ã¯ã©ã¹åé¡
- ã¯ãã¹ãšã³ããããŒã¯ãã©ãã«ã1ã€ã®å Žå
- æ¹æ³1: è€æ°ã®ïŒå€åé¡åšãèšç·Ž
- è€æ°ã®ç®ç颿°
- èšèªã¢ãã«ã®ç®ç颿°ãšææžåé¡ãåæã«æé©åãã
- ãããããåŠç¿ããåèªã®åã蟌ã¿è¡šçŸã䜿ãããšã
- åæ¹åRNN
- ååãRNNã®æåŸã®é ãç¶æ ïŒåŸãåãRNNã®æåŸã®é ãç¶æ
- ãã ãããã¹ãçæã«ã¯äœ¿ããªã
- çæã¢ãã«ã«ãèå¥ã¢ãã«ã«ããªã
-
éç³»ååãã¥ãŒã©ã«ããã
- ååž°åããã
- æ§æã®åœ¢ã«æœåšç¶æ ãæ§æ
- èªå·±ç¬Šå·ååš (autoencoder) ã®æå€±é¢æ°ãå°å
¥
- ç³ã¿èŸŒã¿ããã
- ç³ã¿èŸŒã¿ ãã£ã«ã¿ãé©çš
- Subsample (maxãªã©)äžéšã®ç»çŽ ã ããæ®ã
- åèª x åã蟌㿠ã®è¡åãå ç»åãšèŠãªã
- å©ç¹ïŒéããBOW ã§ååãè¡åïŒå°ããçªïŒã䜿ãã®ã§ãæ§é ãå°ã䜿ãã
- æ¬ ç¹ïŒé次çã§ã¯ãªããå¯å€é·ã®ããã¹ãã«å¯Ÿããçæã¢ãã«ã¯å°ãé£ãã
| è±èª | æ¥æ¬èª |
|---|---|
| plagiarism | çœäœ |
| interpretable | è§£éå¯èœãª |
| reconstruction | åçŸ |
è¬çŸ©6: Nvidia ã® GPU ã䜿ã£ã深局åŠç¿
è¬åž«ïŒJeremy Appleyard (Nvidia)
-
ãªãæ§èœãéèŠã
- èšç·ŽæïŒããå€ãã®ã¢ãŒããã¯ãã£ãå®éšã§ãã
- ãããã¯ã·ã§ã³ïŒãŠãŒã¶ãŒã«ããéãçµæãæç€ºã§ãã
- å šãŠãèªåã§ã¯ãªããæ©æ¢°åŠç¿ã®ç ç©¶è ãç¥ã£ãŠããã¹ãããš
-
ããŒããŠã§ã¢
- CPU â é å»¶ããªãã¹ãå°ããããã«æé©åã倧ããªãã£ãã·ã¥
- GPU â 䞊å床ãéåžžã«é«ããæ°äžã®æŒç®ãåæã«å®è¡ãããã«ããŒã¿ãåž°ã£ãŠããªã
- CPU ãã10å以äžé«ãã¹ã«ãŒããã (gflops)
- ã¡ã¢ãªãŒåž¯åãåãåŸå
-
ã«ãŒãã©ã€ã³ã»ã¢ãã«
- æŒç®åŒ·åºŠ (arithmetic intensity) = flop / ãã€ã â x軞
- æŒç®æ§èœ flop/s â y軞
- ã°ã©ãã«ãããšã屿 ¹ã®ãããªåœ¢ã«ãªã â ã«ãŒãã©ã€ã³
- äŸïŒè¡åä¹ç®ãæŒç®åŒ·åºŠãé«ã
-
RNN (LSTM)
- å€ãã®è¡åä¹ç®
- ãããããå (åæãè¯ããªããããŒããŠã§ã¢äžã§é«éå)
- è¡åä¹ç®ã®å³åŽ (w_t, h_t-1) ã¯å ±é â ïŒã€ã®ä¹ç®ãäžã€ã«ãŸãšããããšãå¯èœ
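äžã®ãè¡åä¹ç®ã®å³åŽ ([w_t; h_{t-1}]) ãå
±éãªã®ã§è€æ°ã®ä¹ç®ã 1 ã€ã«ãŸãšããããããšãæ°å€ã§ç¢ºãããã¹ã±ããã§ããcuDNN ã®å®è£
ãã®ãã®ã§ã¯ãªããã²ãŒãããšã®éã¿ãçžŠã«ç©ãã§ 1 åã® GEMM ã«èåããäŸã§ããµã€ãºã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

dim_x, dim_h, batch = 32, 64, 16
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(dim_h, dim_x + dim_h)) for _ in range(4)]  # i, f, o, g ããããã®éã¿
xh = rng.normal(size=(dim_x + dim_h, batch))                      # [x_t; h_{t-1}] ããããåã«ãã

separate = [W @ xh for W in Ws]             # 4 åã®å°ããªè¡åä¹ç®
fused = np.vstack(Ws) @ xh                  # 瞊ã«ç©ãã§ 1 åã®å€§ããªè¡åä¹ç®
print(np.allclose(np.vstack(separate), fused))  # True: çµæã¯åãã§ãæŒç®åŒ·åºŠãäžãã
```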
-
è¡åxè¡åä¹ç® (GEMM - BLAS ã®é¢æ°å)
- èšç®çç¥
- LSTM: flops / ãã€ãã®æ¯ã¯ 2HB:3B+4H â O(n) ã ããH ãš B ã®å€ã«å€§ããäŸå
- ãããã¯ã·ã§ã³ïŒããããµã€ãºã¯ 1 ã§ããããšãå€ã
- ã«ãŒãã©ã€ã³ã»ã¢ãã« (ãããã®ãµã€ãº 察 GFLOP/s)
- ããããµã€ãº = 32 ã 64 ãããã«ãè§ãããã
- 宿ž¬å€ãšçè«å€ã¯ããäžèŽããŠããïŒè§ã®ããã以å€ïŒ
- ãããããåã¯éåžžã«å€§åïŒ
-
ãããã¯ãŒã¯ã¬ãã«ã®æé©å
- ã©ããã£ãŠçµæãå€ããã«é«éåããã
- ã¡ã¢ãªãŒè»¢éãæžãã
- ãªãŒããŒããããæžãã
- 䞊å床ãäžãã
-
æé©å1 (ã¡ã¢ãªãŒè»¢é)
- è¡å A (åºå®) ãã¡ã¢ãªã«ããŒãããæéãåæž
- å ¥å w_t ã¯ãäºãã«ç¬ç«
- W*[w_t; h_(t-1)] ããw ã«äŸåããéšåãš h ã«äŸåããéšåã®åã«åè§£
- w ãã°ã«ãŒãå ããããµã€ãºãå¢ããã®ãšåç
- æ°žç¶RNNs â ãªã«ã¬ã³ãè¡åãããããäžã®ã¡ã¢ãªã«ä¿æããŠããé«åºŠãã¯ããã¯
-
æé©å2
- ãªãŒããŒããã
- èŠçŽ ããšã®ç©ãåæŒç®ããšã«ã«ãŒãã«ãèµ·å
- æŒç®ããšã«ã«ãŒãã«ãèµ·åããªããã°ãããªãçç±ã¯ãªã
-
æé©å3
- 䞊å床ãäžãã
- å€å±€RNN
- åçŽãªæ¹æ³ïŒïŒå±€ç®ããã¹ãŠèšç®ã次ã«ïŒå±€ç®ããã¹ãŠèšç®ãetc.
- 代ããã«ãæŸå°ç¶ã«äºãã«äŸåããªãã»ã«ãåæã«èšç®ãã
-
cuDNN
- LSTM ãªã©ã®æšæºçãªé«éåãæäŸãããŠãã
- BLAS, FFT, ä¹±æ°çæ ãªã©ã®ã©ã€ãã©ãªã
-
ãããã«
- æ§èœãæèããããšã¯å€§äº
- ãœãããŠã§ã¢ãšããŒããŠã§ã¢ã®éžæãäž¡æ¹ã圱é¿
| è±èª | æ¥æ¬èª |
|---|---|
| bound | äžé |
| intensity | 匷床 |
| pointwise | (æ¬è¬çŸ©ã§ã¯)èŠçŽ ããšã® (element-wise ãšåã) |
| back-to-back | é£ãåããã« |
è¬çŸ©7: æ¡ä»¶ä»ãèšèªã¢ããªã³ã°
è¬åž«ïŒChris Dyer (DeepMind / ã«ãŒãã®ãŒã¡ãã³å€§)
-
æ¡ä»¶ãç¡ããã®èšèªã¢ãã«
- ããèªåœäžã®æååã«å¯ŸããŠç¢ºçãä»äž
- éå»ã®å±¥æŽããæ¬¡ã®åèªãäºæž¬ããåé¡ã«åçŽå
-
æ¡ä»¶ä»ãèšèªã¢ãã«
- ããæèæ¡ä»¶ x ã®ããšãèšèªãçæ
- x: èè ãw: ãã®èè ã®ããã¹ã
- x: ãã©ã³ã¹èªã®æ, w: 翻蚳ãããè±èªã®æ (確çã圹ã«ç«ã€ïŒ)
- x: ç»å, w: ç»åã®ãã£ãã·ã§ã³
- x: é³å£°, w: é³å£°ã®æžãèµ·ãã
- x: ææžïŒè³ªå, w: å¿ç
- èšç·ŽããŒã¿ã¯å ¥å, åºåã®ãã¢
-
ã¿ã¹ã¯ã«ãã£ãŠãå©çšã§ããããŒã¿ã®éã倧ããç°ã
- 翻蚳ãèŠçŽããã£ãã·ã§ã³ãé³å£°èªèçã¯æ¯èŒçå€§èŠæš¡ãªããŒã¿
-
ã¢ã«ãŽãªãºã
- æã確çã®é«ã w ãæ¢ãã®ã¯å°é£
- ããŒã ãµãŒã
- è©äŸ¡
- ã¯ãã¹ãšã³ããããŒ, ããŒãã¬ãã·ã㣠(å®è£ ïŒæ®éãè§£éïŒé£ãã)
- ã¿ã¹ã¯äŸåã®è©äŸ¡ äŸïŒç¿»èš³ã®BLEUïŒå®è£ ïŒç°¡åãè§£éïŒæ®éïŒ âãªã¹ã¹ã¡
- 人æè©äŸ¡ïŒå®è£ ïŒé£ãããè§£éïŒç°¡åïŒ
-
ãšã³ã³ãŒããŒã»ãã³ãŒããŒã¢ãã«
- å ¥å x ãåºå®é·ãã¯ãã« c ã«ã©ããšã³ã³ãŒãïŒç¬Šå·åïŒããã â åé¡äŸå
- ãã³ãŒãæã«ãã©ããã£ãŠ c ãæ¡ä»¶ãšããŠäœ¿ãã â ããŸãåé¡äŸåã§ã¯ãªã
-
Kalchbrenner and Blunsom 2013
- ãã³ãŒããŒïŒå€å žç㪠RNN ïŒ åæããšã³ã³ãŒããããã® s
- ãšã³ã³ãŒããŒïŒåçŽãªã¢ãã«ïŒåèªåã蟌ã¿ã®å
- å©ç¹ïŒéããããŒã¿éãå°ãªããŠæžã
- æ¬ ç¹: èªé ãèæ ®ããªã éåæçãªèª ("hot dog") ãæ±ããªã
- ããå°ãè³¢ãã¢ãã«ïŒConvolutional sentence model (CSM)
- éãåãããç³ã¿èŸŒã¿å±€
- å©ç¹ïŒå±æçãªçžäºäœçšãæ§æçãªãã®ãæãããã
- æ¬ ç¹ïŒé·ãã®ç°ãªãæãã©ãæ±ãã
-
Sutskever et al. 2014
- ããã«åçŽãªæ§é ããšã³ã³ãŒããŒã»ãã³ãŒããŒäž¡æ¹ LSTM
- ãšã³ã³ãŒãã£ã³ã°ã¯ (c_l, h_l) (泚ïŒc â ã»ã«ç¶æ ïŒh â é ãç¶æ )
- å©ç¹ïŒLSTM ãé·è·é¢ã®äŸåé¢ä¿ãèŠãããã æ¬ ç¹ïŒé ãç¶æ ãå€ãã®æ å ±ãèŠããªããšãããªã
- 工倫
- ãšã³ã³ãŒããšãã³ãŒãéã§æãäžããé åºãéã«ãã
- ã¢ã³ãµã³ãã« (Softmax ã®åã«ãè€æ°åã®ã¢ãã«ã®åºåãå¹³åãã)
- ãã³ãŒãã£ã³ã°
- arg max P(w | x) ã® arg max ãæ£ç¢ºã«èšç®ããã®ã¯é£ãã
- 貪欲æ³ã§ä»£æ¿ çŽåã®åèªãæ£ãããšä»®å®ããŠæ¬¡ã®åèªã«ç§»ã
- ããŒã ãµãŒãïŒäžäœ b åã®ä»®èª¬ãä¿æããªãããã³ãŒãïŒäžã®ã¹ã±ããåç
§ïŒ
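äžã®ããŒã ãµãŒãã®éªšæ Œã ãã瀺ãã¹ã±ããã§ããnext_log_probs ã¯æ¡ä»¶ä»ãèšèªã¢ãã«ã®ä»£ããã«çœ®ããä»®ã®é¢æ°ã§ãå®éã«ã¯ãã³ãŒã RNN ã®åºåïŒlog P(w | x, çæžã¿ç³»å)ïŒã䜿ããŸãã

```python
import numpy as np

vocab = ["<eos>", "a", "b", "c"]
rng = np.random.default_rng(0)

def next_log_probs(prefix):
    # ä»®ã®ã¢ãã«ïŒæ¬æ¥ã¯ãã³ãŒã RNN ã§æ¬¡åèªã®å¯Ÿæ°ç¢ºçãèšç®ãã
    logits = rng.normal(size=len(vocab))
    return logits - np.log(np.exp(logits).sum())

def beam_search(beam_size=3, max_len=5):
    beams = [([], 0.0)]                      # (ä»®èª¬, 环ç©å¯Ÿæ°ç¢ºç)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "<eos>":   # çµãã£ãä»®èª¬ã¯ãã®ãŸãŸæ®ã
                candidates.append((seq, score))
                continue
            lp = next_log_probs(seq)
            for i, w in enumerate(vocab):
                candidates.append((seq + [w], score + lp[i]))
        beams = sorted(candidates, key=lambda t: t[1], reverse=True)[:beam_size]  # äžäœ b åã ãæ®ã
    return beams

print(beam_search())
```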
-
ç»åãã£ãã·ã§ã³çæ
- ãã¥ãŒã©ã«ãããã¯å šãŠããã¯ãã« â è€æ°ã®ã¢ããªã㣠(æ§åŒ) ã«å¯Ÿå¿ããã®ãç°¡å
- ImageNet ã®èšç·Žæžã¿ã¬ã€ã€ãŒãç»åã®ãšã³ããã£ã³ã°ãšããŠäœ¿çš
- Kiros et al. (2013)
- äžè¿°ã® K&B 2013 ã«é¡äŒŒ
- é ãç¶æ ã®æŽæ°ã®éã«ããšã³ã³ãŒããããç»åãè¶³ãåãããã ã
- ä¹ç®çèšèªã¢ãã« ãã³ãœã« r_(i, j, w) â åè§£ããŠäœã©ã³ã¯è¿äŒŒ
-
質å
- åèªãã©ã翻蚳ããã®ã¯æèäŸåã§ã¯ãªããïŒ
- Yes! ã ããçŸåšãã䜿ãããŠãããã¹ãã»ããã§ã¯æã¯ããªãç¬ç«
- äŒè©±æã®ãããªããè¯ããã¹ãããŒã¿ã§ã¯å¹æããããã
| è±èª | æ¥æ¬èª |
|---|---|
| unconditional | ç¡æ¡ä»¶ã®/æ¡ä»¶ã®ç¡ã |
| conditional | æ¡ä»¶ä»ãã® |
| transcription | (é³å£°ã®) æžãèµ·ãã |
| intractable | èšç®éçã«å°é£ |
| modulo | ããé€ã㊠|
| modality | æ§åŒ (äŸïŒç»å vs ããã¹ã) |
| architecture | ã¢ãŒããã¯ã㣠(ãã¥ãŒã©ã«ãããã®æ§é ) |
| compositional | (æå³ã) åæç㪠|
| draconian | 極ããŠå³ãã |
è¬çŸ©8: ã¢ãã³ã·ã§ã³ã䜿ã£ãèšèªçæ
è¬åž«ïŒChris Dyer (DeepMind / ã«ãŒãã®ãŒã¡ãã³å€§)
埩ç¿ïŒæ¡ä»¶ä»ãèšèªã¢ãã«
- åé¡ç¹
- ãã¯ãã«ã«ããæ¡ä»¶ä»ã â æãåºå®é·ã®ãã¯ãã«ã«å§çž®
- åŸé
ãéåžžã«é·ãè·é¢ãäŒæããªããã°ãªãããLSTM ã§ãå¿ããŠããŸã
æ©æ¢°ç¿»èš³ã«ãããã¢ãã³ã·ã§ã³
-
è§£æ³
- åèšèªã®æãè¡åã§è¡šçŸ â 容éåé¡ã解決
- 察象èšèªã®æãè¡åããçæ â äŒæåé¡ã解決
-
æã®è¡å衚çŸ
- è¡åã®åã®æ° = åèªã®æ°
- åçŽãªã¢ãã«ïŒåèªãã¯ãã«ã®é£çµïŒåçŽãããŠèª°ãè«æã«æžããŠããªãïŒ
- ç³ã¿èŸŒã¿ãããïŒGehring et al. (2016) K & B (2013) ã«äŒŒãŠãã
- åæ¹åRNN: Bahdanau et al. (2015) ã«ããæå
- ååãïŒåŸãåã â é£çµ âãè¡å (2n x w)
- 2017幎ã®çŸç¶
- äœç³»çã«æ¯èŒããç ç©¶ã¯ã»ãšãã©ç¡ã
- ç³ã¿èŸŒã¿ãããã¯è峿·±ããããŸãç ç©¶ãããŠãªã
-
è¡åããã®çæ (Bahdanau et al. 2015)
- çææã«ãRNN ã¯ïŒã€ã®æ
å ±ã䜿ã
- çŽåã«çæããåèªã®ãã¯ãã«è¡šçŸ
- å
¥åè¡åã®ããã¥ãŒã
- æéããšã«ãå ¥åè¡åã®éãéšåããæ å ±ãåãåºã
- éã¿ a_t (é·ã = |f|) â ã¢ãã³ã·ã§ã³
- åïŒçæåŽã®ïŒåèªãïŒå ¥ååŽã®ïŒåèªã«ã©ã察å¿ããŠãããè§£éå¯èœ
- ã©ã a_t ãèšç®ããã
- åæå» t ã«ãæåŸ ãããå ¥åãã¯ãã« r_t = V s_(t-1) ãèšç® (V ã¯åŠç¿å¯èœãã©ã¡ãŒã¿)
- â ããããF ã®ååãšã®å ç©ãèšç® â Softmax ã㊠a_t ãåŸã
- (Bahdanau et al. 2015) â å ç©ã MLP ã§çœ®ãæã
- BLEU +11!
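äžã®ã¢ãã³ã·ã§ã³éã¿ a_t ã®èšç®ïŒr_t = V s_{t-1} ãšå
¥åè¡å F ã®åå衚çŸãšã®å
ç© â softmax â éã¿ä»ãåã§ã³ã³ããã¹ãã»ãã¯ãã«ïŒã ãã忿žãããã¹ã±ããã§ããMLP ã¹ã³ã¢ã§ã¯ãªãå
ç©çã«ç°¡ç¥åããŠããããã¯ãã«ã®æ¬¡å
ãå€ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

src_len, dim_f, dim_s = 6, 8, 10
rng = np.random.default_rng(0)
F = rng.normal(size=(dim_f, src_len))   # å
¥åæã®è¡å衚çŸïŒååã 1 åèªïŒ
s_prev = rng.normal(size=dim_s)         # ãã³ãŒããŒã®çŽåã®æœåšç¶æ


V = rng.normal(size=(dim_f, dim_s))     # åŠç¿å¯èœãã©ã¡ãŒã¿ïŒä»®ã®å€ïŒ
r_t = V @ s_prev                        # ãæåŸ
ãããå
¥ååãã¯ãã«
a_t = softmax(F.T @ r_t)                # åå
¥ååèªãšã®å
ç© â softmax ã§ã¢ãã³ã·ã§ã³éã¿
context = F @ a_t                       # éã¿ä»ãåïŒãã®æå»ã«èªã¿åºãæ
å ±
print(a_t.round(3), context.shape)
```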
-
ã¢ãã«ã®å€çš®
- Early binding (æ©æçµå)
- Late binding (é
æçµå)ïŒçŸåšã®æœåšç¶æ
ãšãã¢ãã³ã·ã§ã³ã»ãã¯ãã«ãèæ
®ããŠãåèªãçæ
- é ãããïŒ ã¢ãã³ã·ã§ã³ã鿥çã«ããæœåšç¶æ ã«å¯äžããŠãªã
- èšç·Žæã«ãæœåšç¶æ ãšã¢ãã³ã·ã§ã³ã®èšç®ã䞊ååã§ãã
-
ãŸãšã
- ã¢ãã³ã·ã§ã³ã¯ç³ã¿èŸŒã¿ãããã®ãããŒãªã³ã°ãã«äŒŒãŠã
- ã¢ãã³ã·ã§ã³ãå¯èŠåãããšåèªã®ã¢ã©ã€ã³ã¡ã³ãã芳å¯ã§ãã
- åŸé
ã«ã€ããŠ
- ãã³ãŒããŒãééããå Žåãã¢ãã³ã·ã§ã³ã®åŒ·ãèªã«åŸé ã匷ãäŒæãã
- 翻蚳ãšã¢ãã³ã·ã§ã³
- 人éã翻蚳ããæãæãèšæ¶ããããã§ã¯ãªã â å¿ èŠã«å¿ããŠåæãåç §ãã
ç»åãã£ãã·ã§ã³çæã«ãããã¢ãã³ã·ã§ã³
-
Vinyals et al. 2014
- Sutskever ã®ã¢ãã«ãšåã
- ãã ããç»åã®ãšã³ã³ãŒããŒã¯ç³ã¿èŸŒã¿ããã
-
ã¢ãã³ã·ã§ã³ã¯åœ¹ã«ç«ã€ãïŒ
- Yes!
- ç³ã¿èŸŒã¿ãããã®åç¥èŠéãç³ã¿èŸŒãã ãã¯ãã« â ã¢ãããŒã·ã§ã³ã»ãã¯ãã« a
- ã¢ãã³ã·ã§ã³ã®éã¿ããBahdanau et al. 2014 ã®æ¹æ³ã§èšç®
- Stochastic hard attention (Xu et al. 2015)
- ãœãããªååžã§ã¯ãªããç¥èŠéãäžã€ã«æ±ºããŠãµã³ãã«
- Jensen ã®äžçåŒã䜿ãåçŽå
- MCMC ã䜿ããµã³ããªã³ã°
- 匷ååŠç¿ã® REINFORCE
- ã¢ãã³ã·ã§ã³ã®éã¿ â åèªãçæããæã«ã©ãã«æ³šç®ããŠããããå¯èŠåã§ãã
- BLEU ã䜿ã£ãŠç»åãã£ãã·ã§ã³ãè©äŸ¡ããã®ã¯ãæ©æ¢°ç¿»èš³ã«æ¯ã¹ãŠé£ãã
| è±èª | æ¥æ¬èª |
|---|---|
| vehemently | ççã« |
| Vulgar Latin | ä¿ã©ãã³èª |
| receptive field | ç¥èŠé |
| inequality | äžçåŒ |
è¬çŸ©9: é³å£°èªè
è¬åž«ïŒAndrew Senior (DeepMind)
-
é³å£°èªè
- ASR èªåé³å£°èªè é³å£°ã®æ³¢åœ¢âããã¹ã
- TTS ããã¹ãèªã¿äžã ããã¹ãâé³å£°ã®æ³¢åœ¢
-
é¢é£ããåé¡
- èªçºçºè©± vs èªã¿äžã, å€§èŠæš¡èªåœ, éé³ã®ããç°å¢, äœè³æº, èšã, etc.
- TTS
- 話è ç¹å®
- é³å£°åŒ·èª¿
- 鳿ºåé¢
-
é³å£°
- æ°å§ã®å€åã®æ³¢
- 声垯 â 声éã«ããå€èª¿ â 調é³ïŒæ¯é³ïŒ ïŒ æ©æŠãééïŒåé³ïŒ
- é³å£°ã®è¡šçŸ
- 人éã®é³å£°ã¯ ~85 Hz - 8 kHz
- è§£å床 (bits per sample) 1 bit ã§ãçè§£å¯èœ
- ããäœæ¬¡å
ã®ããŒã¿ â é«éããŒãªãšè§£æ (FFT) ããŠåšæ³¢æ°åž¯ããšã®ãšãã«ã®ãŒã«å€æ
- é³å£°ã®åé¡ãç»åèªèã®åé¡ã«å€æïŒ
- ãã ã xè»žïŒæéïŒã¯å¯å€
- FFT ããŸã 次å
ãå€ããã
- ã¡ã«å°ºåºŠ (人éã®èŽåç¹æ§ã«åãããéç·åã¹ã±ãŒã«) ã«å€æãã颿£ãŠã€ã³ããŠã䜿ã£ãŠããŠã³ãµã³ããªã³ã°
- 40次å çšåºŠ
- MFCC (ã¡ã«åšæ³¢æ°ã±ãã¹ãã©ã ä¿æ°)
- ã¡ã«å°ºåºŠã®ãã£ã«ã¿ãã³ã¯ããåŸãããå€ã颿£ã³ãµã€ã³å€æ (äž»æååæã«é¡äŒŒ)
- é£ç¶ãããã¬ãŒã éã§ç©ã¿éã
- é³å£°èªèã®æŽå²
- 1960幎代ïŒDynamic Time Warping (ãã³ãã¬ãŒãã䌞瞮ããŠãããã³ã°)
- 1970幎代ïŒé ããã«ã³ãã¢ãã«
- 1995-ïŒã¬ãŠã¹æ··åã¢ãã«ãäž»æµ
- 2006-ïŒãã¥ãŒã©ã«ãããã¯ãŒã¯
- 2012-ïŒRNN
- ã³ãã¥ãã±ãŒã·ã§ã³ãšããŠã®é³å£°
- é³çŽ (phoneme) - åèªã»æå³ãåºå¥ããæå°ã®åäœ è¡šèšïŒIPA/X-SAMPA
- é»åŸ(prosody) - ãªãºã ãã¢ã¯ã»ã³ããã€ã³ãããŒã·ã§ã³ãªã©ãèªèããç ç©¶ã¯å€ãããããŸã䜿ãããªã
- ããŒã¿ã»ãã
- TIMIT (å°ãããé³çŽ å¢çã人æã§ä»äž)
- Wall Street Journal èªã¿äžã
- ...
- Google voice search
- å®éã®ãŠãŒã¶ãŒã®çºè©±
- ïŒå¹Žéã ãä¿åããã®åŸç Žæ£
- DeepSpeech
- çºè©±è ãããããã©ã³ã§éé³ãèããªããçºè©± â çºè©±ã«åœ±é¿
- ãã®äžã«éé³ãä»äž
- 確ççé³å£°èªè
- å ¥å o (observation; 芳å¯) ããã æãå°€ããããåèªç³»å w ãæ±ããã
- HMM ç¶æ ïŒé³çŽ ãåºåïŒMFCCãªã©ã®ãã¯ãã«
- é³ã®åäœ
- æèéäŸåã®HMM ç¶æ (èªé ã»èªäžã»èªæ«ã®ïŒç¶æ ãå¥ã ã«ã¢ãã«å)
- æèäŸåã®HMM ç¶æ (äŸ: "cat" ã® /k/ ãš "cone" ã® /k/ ãéã)
- diphone (é³çŽ ã®ãã€ã°ã©ã )
- é³ç¯
- åèªå šäœ (YouTube é³å£°èªèã®è«æ)
- graphemes (æå) - åèªâçºé³ã®èŸæžãæããªããŠãè¯ã
- è±èªã§ã¯æ®é åèª<->çºé³ã®å¯Ÿå¿ã(é³å£°èªèã®å¯äœçšãšããŠ)åŠç¿ (äŸïŒ"ough")
- ä»ã®èšèªïŒã€ã¿ãªã¢èªã»ãã«ã³èªïŒã§ã¯ç¶Žããé³ã«äžèŽ
- æèäŸåã®é³çŽ ã¯ã©ã¹ã¿ãªã³ã°
- é³çŽ ã®ãã©ã€ã°ã©ã ãèãããšã3 x 42^3 ã®çµã¿åãã
- ãã®ã»ãšãã©ã¯èµ·ãããªã â ã¯ã©ã¹ã¿ãªã³ã°
- é³å£°èªèã®åºæ¬åŒ
- w^ = arg max P(w | o) = arg max P(o | w)P(w)
- P(o|w) ... é³é¿ã¢ãã«ã¹ã³ã¢, P(w) ... èšèªã¢ãã«ã¹ã³ã¢
- èšèªã¢ãã«ïŒchain rule ã䜿ã£ã n-gramïŒâ ãã³ãŒãã£ã³ã°æã«æåè£ãã©ã³ãã³ã°
- n-gram èšèªã¢ãã«ãš LSTMçã®èå¥çèšèªã¢ãã«ãçµã¿åããããšè¯ãçµæ
- 倿ãšããŠã®é³å£°èªè
- é³å£°âãã¬ãŒã âç¶æ âé³çŽ âåèªâæâæå³
- éã¿ä»ãæéç¶æ ãã©ã³ã¹ãã¥ãŒãµãŒ(WFST) (é³çŽ âåèªãžã®å€æ)
- åèªâæãžå€æãã WFST ãšåæ
- ã¬ãŠã¹æ··åã¢ãã« (é³é¿ã¢ãã«)
- 1990ã2010 ã®äž»æµã¢ãã«
- è€æ°ã®ã¬ãŠã¹ååžã®éã¿ä»ãåãåååžã®å¹³åãšåæ£(å¯Ÿè§æåã®ã¿)ãåŠç¿
- EMã¢ã«ãŽãªãºã ã«ãã£ãŠåŠç¿ E:匷å¶ã¢ã©ã€ã³ã¡ã³ã, M:ãã©ã¡ãŒã¿ã®æšå®
- ãšãŠã䞊ååããããããããŒã¿ãå¹ççã«å©çšã§ããªã (1ãã¬ãŒã â1é³çŽ )
- 匷å¶ã¢ã©ã€ã³ã¡ã³ã
- ãã¿ããã¢ã«ãŽãªãºã ã䜿ããèšç·ŽããŒã¿ã«ãããŠãçŽ æ§ãšé³çŽ ç¶æ
ãæå°€ã¢ã©ã€ã³ã¡ã³ã
- ãã³ãŒãã£ã³ã°
- èªèæã¯ãè¡åã®ãããã«ã°ã©ãã«ãªãïŒèªã®éã®ç©ºçœãè€æ°ã®å¯èœæ§ãetc.ïŒ
- ããŒã ãµãŒã (ã¹ã³ã¢ã®é«ã top-n çµè·¯ã ããæ®ã)
-
ãã¥ãŒã©ã«ãããã¯ãŒã¯ãçšããé³å£°èªè
- çŽ æ§ãèšç® or 確çãèšç®
- çŽ æ§ãèšç®
- éåžžã®ååããããã¯ãŒã¯ãããã«ããã¯å±€ã®å€ãçŽ æ§ãšäœ¿ã
- å ã®ã¬ãŠã¹æ··åã¢ãã«ãšåãããŠïŒé£çµããŠïŒäœ¿ã
- ãã€ããªããã»ããã
- é³çŽ ã®åé¡åšãšããŠNNãåŠç¿
- P(o | c) ã GMM ã§ã¯ãªãNNã§ã¢ãã«å
- èšèªã¢ãã«ãšé³é¿ã¢ãã«ãéã¿ã§èª¿æŽãããšè¯ãçµæ
- ç³ã¿èŸŒã¿åŒããã (CNN)
- æŽå²ã¯é·ã
- WaveNet ïŒé³å£°åæïŒã§ã䜿ããã
- æé軞äžã® pooling ã¯è¯ããªãïŒæéæ å ±ãæšãŠãŠããŸãïŒåšæ³¢æ°ã®é åã§ã¯ OK
- ç¹°ãè¿ãåããã (RNN)
- RNN, LSTM, ..
- CLDNN (Sainath et al., 2015a) - CNN + LSTM + DNN
- GRU (DeepSpeech)
- åæ¹åã¢ãã«ã§æ§èœã¯äžããããé å»¶ãçãã ïŒçºè©±ã®çµãããŸã§åŸ ããªããšãããªãïŒ
- ãã¯ããã¯ïŒçºè©±ãçµãã£ããã©ãã確信ã®ç¡ã段éã§ãWebæ€çŽ¢ãéå§ â äœé å»¶ãå®çŸ
- Switchboard (å€§èŠæš¡ã»é»è©±ã»èªçºçºè©±ã³ãŒãã¹)ã§äººéã«å¹æµ (Xiong et al., 2016)
- BLSTMã®ã¢ã³ãµã³ãã«
- i-vector ã§è©±è ãæ£èŠå
- CTC (Connectionist Temporal Classification) â ãã¯ããã¯ã®éå
- é³çŽ ã®éã«ãã©ã³ã¯ïŒç©ºçœïŒã·ã³ãã«ãæ¿å
¥
- ç¶ç¶çã¢ã©ã€ã³ã¡ã³ã
- Sequence discriminative training
- Cross entropy ã¯ãæ£ããã¯ã©ã¹ã®ç¢ºçãæå€§åãã
- æ¬åœã«æå°åãããã®ã¯ WER (åèªèª€ãç) â ãããšè¿ãã埮åå¯èœãªæå€±é¢æ°ã䜿ã
- èšç·Žæã«ãã³ãŒãã£ã³ã°(èšèªã¢ãã«ãå«ã)ãããééããéšåãšæ£è§£ãšã®å·®ãå¢ãã
- WER ã 15ïŒ åæž
- seq2seq
- åºæ¬ç㪠seq2seq ã¯é³å£°èªèã«åããŠãªããçºè©±ã¯é·ãããïŒæ©æ¢°ç¿»èš³çãšç°ãªããå調 (monotone)
- ã¢ãã³ã·ã§ã³ã¯åããŠã Attention + seq2seq (Chorowski et al. 2015)
- Listen, Attend, Spell (Chan et al., 2015)
- å¥ã ã«åŠç¿ããèšèªã¢ãã«ãçµ±åããã®ãé£ãã
- Watch Listen, Attend, Spell (Chung et al., 2016) é³å£°èªèïŒãããªããèªå é³å£°ã ããããè¯ãããããªã ãã§ãé³å£°èªèã§ãã (WER = 15%)
- Neural transducer - ã¢ãã³ã·ã§ã³ã¯ç³»åå šäœãèŠãªããã°ãªããªãããã£ã³ã¯æ¯ã«èªèããããšã§è§£æ±º
| è±èª | æ¥æ¬èª |
|---|---|
| peculiarity | ç¹ç°ãªç¹ |
| nominal | åç®äžã® |
| babble | ã¬ã€ã¬ã€ãšãã話ã声 |
| Zulu | ãºãŒã«ãŒèª |
| prosody | é»åŸ |
| acoustic | é³é¿ç㪠|
| vocal tract | 声é |
| arbitrarily | ä»»æã« |
| perceptive | ç¥èŠç㪠|
| Polish [pouliÊ] | ããŒã©ã³ã(人ã®/èª) |
| polish [pÉliÊ] | 磚ã |
| lexicon | èŸæž |
| precursor | å é§ã |
| monotonic | å調 |
è¬çŸ©10: é³å£°åæ
è¬åž«ïŒAndrew Senior (DeepMind)
é³å£°èªèïŒç¶ãïŒ
-
End-to-end ã¢ãã«
- çã®é³å£°ããŒã¿ããããã¹ããåºåããã¢ãã«ãçŽæ¥åŠç¿ãã
- é³çŽ ã®ä»£ããã«æå/åèªãåºå
- çŽ æ§ã®èšç®ãåçŽåã(MFCC/log-Mel ãªã©ã®äººæã§äœãããçŽ æ§ã«é Œããªã)
- äŸåé¢ä¿ã®è·é¢ãé·ããªã
- Clockwork RNN (Koutnik et al, 2014) åšæã®ç°ãªãéå±€çãªè€æ°ã®RNN
-
ãã£ã«ã¿ã®åŠç¿
- ç¹æ§åšæ³¢æ°ã®ããŒã¯ãäœãåšæ³¢æ°ããé ã«ãããã
- ããäœãåšæ³¢æ°åž¯ãã«ããŒãããã£ã«ã¿ãå€ã
-
éé³ã®ããç°å¢äžã§ã®é³å£°èªè
- ãã€ãºã人工çã«åæ
- Google ã§ã¯ãYouTube ã®ãããªããã¹ããŒã以å€ã®éšåãæœåºããŠåæ
- éšå±ã·ãã¥ã¬ãŒã¿ãŒ
- denoiser (ãã€ãºé€å»åš) ããã«ãã¿ã¹ã¯çã«åŠç¿ â æåæžãèµ·ããã®ç¡ãé³å£°ããŒã¿ã䜿ãã
- éé³ã®ããç°å¢äžã§ã¯ã人éã®è©±ãæ¹ãå€ãã
-
è€æ°ãã€ã¯ã®é³å£°èªè
- ç ç©¶ã¬ãã«ã§ã®æŽå²ã¯é·ãããã¹ãããAmazon Echo ãªã©ã®è€æ°ãã€ã¯ããã€ã¹ãæ®åããã«ã€ããŠéèŠã«
- ããŒã ãã©ãŒãã³ã° (Beamforming) ãã€ã¯ã®æåæ§ãé«ãã
é³å£°åæ
-
é³å£°åæãšã¯
- ããã¹ãããé³å£°æ³¢åœ¢
- é³å£°çæã®éçš
- 声垯ã»å£°é â å£ã§å€èª¿
-
é³å£°çæã®æµã
- ããã¹ãè§£æ
- äŸïŒæåå²ãåèªåå²ãåè©è§£æãããã¹ãæšæºå, etc.
- äŸïŒ429 㯠four-hundred-twenty-nine ã four-twenty-nine ã
- 颿£ â 颿£ç (NLP)
- é³å£°çæ
- 颿£ â é£ç¶ç
-
é³å£°çæ
- ã«ãŒã«ã«åºã¥ãããã©ã«ãã³ãåæ
- ãµã³ãã«ã«åºã¥ããé£çµåæ â ç¹ãç®ïŒæ¥åéšåïŒã®é³ãäžèªç¶
- ãã¬ãŒãºåæ (äŸïŒé§ ã®ã¢ããŠã³ã¹) â ãæ¬¡ã®åè»ã¯ãïŒå°åïŒãè¡ãã§ããã
- ã¢ãã«ã«åºã¥ããçæçåæ
-
é£çµåæ
- 倿§æ§ã®ããããŒã¿ããŒã¹ãäœæ
- diphones ãã«ããŒ
- ç°¡åãªé³å£°èªèã·ã¹ãã ã䜿ãã匷å¶ã¢ã©ã€ã³ã¡ã³ã â diphone ã®å¢çãç¹å®
- ã³ã¹ãïŒãµã³ãã«ãšæãåºåã®è·é¢ïŒãæå°å
-
ããŒã¿ããŒã¹ã®äœæ
- ã¹ã¿ãžãªé²é³
- äžè²«ããç°å¢
- èæ¯éé³ç¡ã
- ããã®åäžè©±è ãã倧éã®ãªãŒãã£ãªãé²é³
- èªã¿äžãé³å£°ïŒèªçºçºè©±ã§ã¯ãªãïŒ
- ããŒã¿ã»ãã
- VCTK (Voice Cloning Toolkit)
- éè¡æ§çŸæ£æ£è ãããããããèªåã®å£°ãé²é³â話ããªããªã£ãŠãããèªåã®å£°ã§é³å£°åæ
- æ±çšçãªã¢ãã«ãèšç·Žããèªåã®å£°ã«é©å¿
- Merlin
- ãªãŒãã³ãœãŒã¹ã®é³å£°åæã·ã¹ãã
-
é³å£°åæã®è©äŸ¡
- é³å£°èªèã¯ç°¡å â åèªèª€ãç
- é³å£°åæ â 䞻芳ç
- 客芳çãªææš (äŸ: é§ ã®ã¢ããŠã³ã¹)ïŒèããŠåãããïŒä»ãããŸãæå³ããªããªã
- Mean Opinion Score (MOS)ïŒ1 ãã 5 ã®å°ºåºŠ
- A/B éžå¥œãã¹ã ã©ã¡ããããè¯ãã
- 客芳çãªææš
- PESQ
- ããã¹ã㪠MOS
- Blizzard Competition
- é³å£°åæã®ã³ã³ããã£ã·ã§ã³
-
TTSã®ç¢ºççãªå®åŒå
- p(x | w, X, W)
- X: é³å£°æ³¢åœ¢, W: æåæžãèµ·ãã, w: å ¥åããã¹ã, x: åºå波圢
- è£å©å€æ°
- o: é³é¿çŽ æ§, l: èšèªçŽ æ§, λ: ã¢ãã«
- è¿äŒŒ1ïŒç¹æšå®
- è¿äŒŒ2: ã¹ãããæ¯ã®æå€§å
- èšèªççŽ æ§
- æ (é·ã)ãå¥ (ææ)ãåèª (åè©)ãé³ç¯ (匷調ã声調)ãé³çŽ (æå£°/ç¡å£°)
- æç¶æéã¢ãã«
- åé³çŽ ãã©ã®ãããã®æéç¶ç¶ããããå¥ã«ã¢ãã«å
- Vocoder
- voice decoder/encoder 声ã®åæ
çæçé³é¿ã¢ãã«
- HMM
- ã¢ã©ã€ã³ã¡ã³ãã»ã¢ãã«ãšåæ§
- åç¶æ ã«å¯ŸããŠãåºåãã¯ãã«ã®å¹³åã»åæ£ãèšç®ãçæã®éã«å©çšãã
- å€ãã®æ å ±ãå¹³åãããŠããã®ã§ããããã£ã声ã«ãªã
- åé¡ïŒ
- ã¹ã ãŒãºã§ã¯ãªã
- 髿¬¡å ã®é³é¿çŽ æ§ãæ±ãã«ãã
- ããŒã¿ã®æçå
-
ãã¥ãŒã©ã«ããã
- ã¬ãŠã·ã¢ã³ååžã®ä»£ããã«ãã¥ãŒã©ã«ãããã䜿çš
- ãã¬ãŒã éã§åºå®ã§ãªããŠãè¯ã â ããã¹ã ãŒãºãªè»¢ç§»
- é ãå±€ã«ãªã«ã¬ã³ãæ¥ç¶ãå ¥ãããšæ§èœãåäž
- 髿¬¡å ã®çŽ æ§ãã¢ãã«åå¯èœïŒçã¹ãã¯ãã«ããïŒ
- çŸåšã§ã¯ãç ç©¶ïŒè£œåã®äž»æµ
-
End-to-End ã·ã¹ãã
- Auto-encoder ã䜿ããäœæ¬¡å
ã®é³é¿çŽ æ§ãåŠç¿ â è¯ãçµæ
- Source-filter ã¢ãã«ãšé³é¿ã¢ãã«ãåæã«æé©å
- WaveNet
- çé³å£°ã®çæã¢ãã«
- Pixel RNN, Pixel CNN ã®ã¢ãã«ã«é¡äŒŒ
- èªå·±ååž°ã¢ãã«
- ç³ã¿èŸŒã¿ãããã§ã¢ãã«å (Causal dilated convolution) é·è·é¢ã®äŸåé¢ä¿ãæãããã
- åºåã« softmax (ååž°ã§ã¯ãªãåé¡)
- åã«ïŒç·åœ¢ã«ïŒéååãããšå質ãèœã¡ãã®ã§ãÎŒ-law ã¢ã«ãŽãªãºã ãå
ã«é©çšããŠããéååïŒäžã®ã¹ã±ããåç
§ïŒ
- "Dan Jurafsky" ãä»ã§ã¯ãé³å£°åæã¯èšèªã¢ãã«ãšåãåé¡ã ã
- ãã€ãžã¢ã³ End-to-End
- ç©åãã¢ã³ãµã³ãã«ã§è¿äŒŒ
- WaveNet ã§ã¯ãããŠé£çµæ¹åŒãè¶
ãã
- èªç¶ããã¢ãã«ã®æè»æ§ã§ã¯çæã¢ãã«ã®ã»ããé£çµæ¹åŒãããäž
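äžã® WaveNet ã§äœ¿ããã ÎŒ-law å§çž®ïŒæ³¢åœ¢ã察æ°çã«æœ°ããŠãã 256 å€ã«éååãã softmax ã®åé¡åé¡ãšããŠæ±ãïŒã®ã¹ã±ããã§ããÎŒ = 255 ã¯äžè¬çãªå€ã§ãå
¥åæ³¢åœ¢ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

def mu_law_encode(x, mu=255):
    # x 㯠[-1, 1] ã«æ£èŠåãããæ³¢åœ¢ã察æ°çã«å§çž®ã㊠0..mu ã®æŽæ°ã«éåå
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    return ((compressed + 1) / 2 * mu + 0.5).astype(np.int64)

def mu_law_decode(q, mu=255):
    compressed = 2 * (q.astype(np.float64) / mu) - 1
    return np.sign(compressed) * ((1 + mu) ** np.abs(compressed) - 1) / mu

wave = np.sin(np.linspace(0, 4 * np.pi, 8))      # ä»®ã®æ³¢åœ¢
q = mu_law_encode(wave)                          # 256 ã¯ã©ã¹ã® ID ç³»å â åé¡åé¡ãšããŠæ±ãã
print(q, np.round(mu_law_decode(q), 2))
```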
-
課é¡
- æèäŸåæ§ â çãã«ãªãåèªã匷調
- é³å£°åæãšé³å£°èªèãäžã€ã®ã·ã¹ãã ãšããŠèšç·Ž
| è±èª | æ¥æ¬èª |
|---|---|
| filterbank | ãã£ã«ã¿ãã³ã¯ (ãã£ã«ã¿ã®éå) |
| modulate | å€èª¿ãã |
| fricative | æ©æŠé³ |
| click | ããããé³ |
| degenerative | éè¡æ§ |
| intelligible | çè§£ã§ãã |
| muffled | é³ããããã£ã |
| Houston | ããŠã¹ãã³ã(ãã¥ãŒãšãŒã¯ã®éã) |
| Houston | ãã¥ãŒã¹ãã³ (ãããµã¹ã®éœåž) |
è¬çŸ©11: 質åå¿ç
è¬åž«ïŒKarl Moritz Hermann (DeepMind)
-
ãªã質åå¿çãéèŠã
- 質åå¿ç㯠AI å®å
š
- QA ãè§£ããã°ä»ã®åé¡ãè§£ãã
- å€ãã®å¿çš (æ€çŽ¢ãå¯Ÿè©±ãæ å ±æœåº, ...)
- æè¿ã®è¯ãçµæ (äŸïŒIBM Watson Jeopardy!)
- å€ãã®èª²é¡ïŒæ¯èŒç容æãªãã®ãå«ãïŒ
-
質åå¿çã¯ïŒçš®é¡ã®ããŒã¿ã«äŸå
- 質å
- æè/ãœãŒã¹(åºå ž)
- å¿ç (ããèªäœã質åã§ããããšã)
-
質åã®åé¡
- 5W1H
- 質åã®äž»èª
- äºæž¬ãããå¿çã®çš®é¡
- å¿çãåŒãåºãåºå žã®çš®é¡
-
ãå¿çãã«ãŸã泚ç®ãQAã·ã¹ãã ãäœãéã«ã¯
- å¿çã¯ã©ããã£ã圢åŒã
- ã©ãããå¿çãåŒãåºããŠããã
- èšç·ŽããŒã¿ã¯ã©ããã£ã圢åŒã
ããŸãèããã®ãæçšã
- 質åå¿çã®çš®é¡
- èªè§£çè§£
- æå³è§£æ
- ç»å質åå¿ç
- æ å ±æ€çŽ¢
- 峿žåç §
æå³è§£æ (Semantic Parsing)
- èªç¶èšèªãæå³ã®åœ¢åŒè¡šçŸã«å€æ â è«ç衚çŸã䜿ã£ãŠããŒã¿ããŒã¹ãæ€çŽ¢
- ç¥èããŒã¹
- ïŒã€çµã§ç¥èãæ ŒçŽ (é¢ä¿, ãšã³ãã£ãã£1, ãšã³ãã£ãã£2)
- èªç±ã§å©çšå¯èœãªç¥èããŒã¹ (FreeBase, WikiData, etc.)
-
ç¥èããŒã¹ã¯ç°¡åã«å©çšã§ããããèšç·ŽããŒã¿ãå ¥æããã®ã¯å€§å€
- èšèªãè«ç衚çŸã«å€æããããã®èšç·Žãçµã人ããã§ããªã (Amazon Mechanical Turk ã䜿ããªã)
-
深局åŠç¿ã«ããã¢ãããŒã
- æ©æ¢°ç¿»èš³ãšåãã¢ãã«ïŒç³»å倿ïŒ
- åé¡ç¹ïŒèšç·ŽããŒã¿ãå°ãªããç®çèšèªïŒè«ç衚çŸïŒãè€éãåºæåè©ãæ°åã®æ±ããé£ãã
- 解決çïŒè«ç衚çŸã«é Œããªããè«ç衚çŸãæœåšçãªãã®ãšããŠæ±ãã質åâå¿çãçŽæ¥åŠç¿
- æ¹åææ³ïŒã¢ãã³ã·ã§ã³ã䜿ããç®çèšèªåŽã§ã®å¶çŽã䜿ããåæåž«ããåŠç¿ã䜿ã
- è€æ°ã®ãœãŒã¹ããã®çæ
- "Pointer Networks" ã®å©çš
- ããŒã¿ããŒã¹ãåç §
èªè§£çè§£
- æ°èèšäºïŒè³ªåïŒç©Žåã圢åŒïŒ â å¿ç
- å€§èŠæš¡ã³ãŒãã¹ (CNN/DailyMail, CBT, SQuAD)
- åæïŒèšç·Žæã«ã¯åºå žãèŠãªããçãã¯åºå žã®äžã«åèªããã¬ãŒãºã®åœ¢ã§å«ãŸãã
- åºæåè©ãš OOV ãå¿åã®ããŒã«ãŒã«çœ®æ
- èªåœãµã€ãºã®åæžãèšç·Žããã¢ãã«ã®äžè¬å
- ãã¥ãŒã©ã«ã¢ãã«ïŒP(a|q, d) ãã¢ãã«å
- d â åæ¹å LSTM, q â åæ¹å LSTM, åæ
- ã¢ãã³ã·ã§ã³ãå©çšããèªè§£çè§£ â æå» t ããšã®ææždã®è¡šçŸãæ±ããã¯ãšãªè¡šçŸãšåæ
- Attention Sum Reader: å¿çã¯ææžã®äžã«å«ãŸããŠãããšããäºå®ã䜿ããçããäœçœ® i ã«ãã確çããã®ãŸãŸã¢ãã«å
å¿çæéžæ
- 質åã«å¯Ÿããå¿çãšããŠäœ¿ããæãã³ãŒãã¹ïŒãŠã§ãå šäœïŒéžã¶
- ããŒã¿ã»ããïŒTREC QA Track, MS MARCO
- ãã¥ãŒã©ã«ã¢ãã«ïŒå¿çåè£ a ãšã¯ãšãª q ã«å¯ŸããŠãsigmoid(q^T M a + b) ãèšç®ïŒäžã®ã¹ã±ããåç
§ïŒ
- è©äŸ¡ïŒç²ŸåºŠãMRR (å¹³åéæ°é äœ)ãBLEU
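äžã®å¿çæéžæã®ã¹ã³ã¢ãªã³ã° sigmoid(q^T M a + b) ã®æå°ã¹ã±ããã§ããq ãšååè£ a ã¯ãšã³ã³ãŒãæžã¿ã®ãã¯ãã«ãšä»®å®ããM, b ã¯åŠç¿å¯Ÿè±¡ã®ãã©ã¡ãŒã¿ã§ããæ°å€ã¯ãã¹ãŠä»®ã®ãã®ã§ãã

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

dim = 16
rng = np.random.default_rng(0)
q = rng.normal(size=dim)                    # ã¯ãšãªã®ãšã³ã³ãŒãïŒä»®ïŒ
answers = rng.normal(size=(3, dim))         # å¿çåè£æã®ãšã³ã³ãŒãïŒä»®ïŒ
M = rng.normal(scale=0.1, size=(dim, dim))  # åç·åœ¢ã®éã¿ïŒåŠç¿å¯Ÿè±¡ïŒ
b = 0.0

scores = sigmoid(answers @ (M @ q) + b)     # ååè£ã«ã€ã㊠sigmoid(q^T M a + b)
print(scores, scores.argmax())              # ã¹ã³ã¢æå€§ã®åè£ãå¿çãšããŠéžã¶
```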
ç»åQA
- ããŒã¿ã»ãã (VisualQA, VQA 2.0, COCO-QA)
- ãã¥ãŒã©ã«ã¢ãã«ïŒè³ªåâäœããã®ãšã³ã³ãŒã, ç»åâç³ã¿èŸŒã¿ããã
- ãç²ç®ã¢ãã«ãïŒç»åãèŠãªãïŒã§ãããããäžæãè¡ã
-
èªè§£çè§£ãšã¿ã¹ã¯ã䌌ãŠãã â ã¢ãã³ã·ã§ã³ã®ãããªåæ§ã®ãã¯ããã¯ã䜿ãã
-
ãŸãšãïŒQAã·ã¹ãã ã®ã€ãããã
- ã¿ã¹ã¯ã¯äœã
- 質åãå¿çãæèã¯ã©ããªãã®ã
- ããŒã¿ã¯ã©ãããæ¥ãã
- ããŒã¿ãè£å ã§ããã
- 質åãšæèãã©ããšã³ã³ãŒãããã
- 質åãšæèãã©ãçµã¿åãããã
- çããã©ãäºæž¬ã»çæããã
| è±èª | æ¥æ¬èª |
|---|---|
| low hanging fruit | ç°¡åã«è§£æ±ºã§ããåé¡ |
| MET office | ã€ã®ãªã¹æ°è±¡åº |
| factoid | (ã€ã®ãªã¹è±èª) ç䌌äºå® |
| defunct | æ©èœããŠããªã |
| grounded | åºåºç (çåœå€ãæ±ãããã) |
| extrapolate | 倿¿ãã |
| anonymize | å¿ååãã |
| garbage in, garbage out | 質ã®äœãå ¥åããã¯è³ªã®äœãåºåããçãŸããªãããš |
è¬çŸ©12: èšæ¶
è¬åž«: Ed Grefenstette (DeepMind)
-
RNN 埩ç¿
- RNN å ¥åâåºå + é ãç¶æ ht -> ht+1 ã«æŽæ°
- é·è·é¢ã®äŸåãæ±ãã
- å€ãã®NLPã¿ã¹ã¯ãã倿ã¿ã¹ã¯ãšããŠæãããã (æ§æè§£æã翻蚳ãèšç®)
- Learning to Execute â Python ããã°ã©ã ãïŒæåãã€èªã¿èŸŒã¿ãå®è¡çµæãïŒæåãã€åºå
- 泚ïŒè©äŸ¡æã«ããæ£ãããç³»åãå ¥åããŠæ¬¡ã®æåãäºæž¬ã
- èšç®ã®éå±€
- æéç¶æ æ©æ¢°ïŒæ£èŠèšèªïŒ â ããã·ã¥ããŠã³ã»ãªãŒãããã³ (æèèªç±èšèª) â ãã¥ãŒãªã³ã°ãã·ã³ (èšç®å¯èœãªé¢æ°)
-
倿ã¢ãã«ã®ããã«ããã¯
- 容éãå¯å€ã§ã¯ãªã
- åæã®å šãŠã®æ å ±ãé ãç¶æ ã«ä¿æããªããšãããªã
- 察象èšèªã®ã¢ããªã³ã°ã«å€§éšåã®æéãããã
- ãšã³ã³ãŒããŒã«äŒããåŸé ãå°ãã
-
RNNã®éç
- ãã¥ãŒãªã³ã°ãã·ã³ â RNNã«å€æåããã ãåŠç¿å¯èœãšã¯éããªã
- åçŽãª RNN ã¯ããã¥ãŒãªã³ã°ãã·ã³ãåŠç¿ã§ããªã
- RNN ã¯ãæéç¶æ æ©æ¢°ã®è¿äŒŒ
- RNN ã®ç¶æ ã¯ãã³ã³ãããŒã©ãŒãšèšæ¶äž¡æ¹ã®åœ¹å²
- é·è·é¢ã®äŸåé¢ä¿ã¯ããã倧容éã®èšæ¶ãå¿ èŠ
- æéç¶æ æ©æ¢°ã¯ãããããéçããã
-
RNN åè
- APIã®èŠç¹ããèããïŒåã®ç¶æ ïŒå ¥å â æ¬¡ã®ç¶æ ïŒåºåïŒ vanilla RNN ã LSTM ãåã
- ã³ã³ãããŒã©ãŒãšèšæ¶ãåãã
-
ã¢ãã³ã·ã§ã³
- ããããŒã¿ã衚çŸãããã¯ãã«ã®é å
- å ¥åºåããžãã¯ãå¶åŸ¡ããã³ã³ãããŒã©ãŒ
- åæå»ã§èšæ¶ãèªã
- èšæ¶ã«åŸé ãèç©
- Early Fusion vs Late Fusion â èšæ¶ããèªã¿èŸŒãã ããŒã¿ãå ¥åã«çµ±åããããåºåã«çµ±åããã
- ãšã³ã³ãŒããŒã»ãã³ãŒããŒã¢ãã«ã®ããã®ROM
- ãšã³ã³ãŒããŒã«åŸé ãå¥ã®çµè·¯ã§äŒãã â ããã«ããã¯ã®åé¿
- ãœããã¢ã©ã€ã³ã¡ã³ãã®åŠç¿
- é·ãç³»åã®äžã§æ å ±ãèŠã€ãããã
- èšæ¶ãåºå®
- ããã¹ã嫿
- Premise (åæ) ãš Hypothesis (仮説) â ççŸ/äžç«/嫿
- åçŽãªã¢ãã«ïŒåæïŒä»®èª¬ã« RNN + ã¢ãã³ã·ã§ã³ãé©çš
- èªè§£çè§£
-
ã¬ãžã¹ã¿æ©æ¢°
- ã³ã³ãããŒã©ãŒã RAM ã«åœ±é¿ãåãŒã
- ã³ã³ãããŒã©ãŒãïŒèšæ¶ã¢ã¯ã»ã¹çšã®ïŒããŒãçæ â ã¢ãã³ã·ã§ã³ã®ããã«äœ¿ã
- ãã¥ãŒãªã³ã°ãã·ã³ãšã®é¢ä¿ïŒäžè¬çãªã¢ã«ãŽãªãºã ãåŠç¿ããã®ã¯é£ããããç¹å®ã®æ¡ä»¶äžã§ã¯å¯èœ (e.g., Graves et al. 2014)
- è€éãªæšè«ã«ã¯ãRNN+ã¢ãã³ã·ã§ã³ä»¥äžã«è€éã§è¡šçŸåã®é«ããã®ãå¿ èŠ
-
ãã¥ãŒã©ã«ã»ããã·ã¥ããŠã³ã»ãªãŒãããã³
- ãã¥ãŒã©ã«ã»ã¹ã¿ãã¯
- ã³ã³ãããŒã©ãŒããpush/popåäœ + ããŒã¿ã決ãã
- é£ç¶å€ã¹ã¿ã㯠åãã¯ãã«ïŒããŒã¿ïŒã«ç¢ºä¿¡åºŠãä»äžãpush/pop ã¯ç¢ºä¿¡åºŠãå ç®/æžç®
- 人工ã¿ã¹ã¯ïŒå€ã®ã³ããŒãå転
- èšèªã¿ã¹ã¯ïŒSVO ãã SOV ãžã®å€æ æ§ã®ç¡ãèšèªããæãèšèªãžã®å€æ
- LSTM ã§ãåæããããåŠç¿ãé ãïŒæ£èŠèšèªã®è¿äŒŒãåŠç¿ããŠããïŒ
- Stack, Queue, Deque, ããããåŸæãªåé¡ãéã
| è±èª | æ¥æ¬èª |
|---|---|
| esoteric | é£è§£ãª |
| with a grain of salt | 話ååã« |
| hierarchy | éå±€ |
| premise | åæ |
| hypothesis | 仮説 |
è¬çŸ©13: ãã¥ãŒã©ã«ãããã«ãããèšèªç¥è
è¬åž«: Chris Dyer
-
èšèªåŠãšã¯
- èšèªã§ã©ãæå³ã衚çŸããã
- è³ãã©ãèšèªãåŠçã»çæããã
- 人éã®èšèªã«ã¯ã©ããããã®ãå¯èœã
- 人éã®åäŸã¯ãå°ãªãããŒã¿ããã©ãèšèªãåŠã¶ã
-
æã®éå±€é¢ä¿
- not ãš anybody ã®éã«NPI (negative polarity item; åŠå®æ¥µæ§é ç®) ã®é¢ä¿
- "not" 㯠anybody ãããå ã«æ¥ãªããšãããªã (æšã®èŠª)
- 仮説: åäŸãèšèªãç°¡åã«åŠã¶ã®ã¯ãæ§é çã«ããããã«æå³ããªããªã仮説ãèããŠããªããã
-
RNN
- RNN ã¯éåžžã«åŒ·åãªã¢ãã«
- ãã¥ãŒãªã³ã°å®å š
- ã©ããã£ãåž°çŽçãã€ã¢ã¹ãããã
- ã©ããã£ãéçšã眮ãã
-
RNN ã®åž°çŽçãã€ã¢ã¹
- é£ããåé¡
- ç³»åçæ°è¿æ§ãåªå
- 蚌æ ïŒåŸé ãç³»åã®å転ãªã©ã®å®éšãã¢ãã³ã·ã§ã³
- ãã§ã ã¹ããŒã人éã®èšèªãå¹ççã«åŠã¶ããã«ã¯ãç³»åçæ°è¿æ§ã¯è¯ããã€ã¢ã¹ã§ã¯ãªãã
-
èšèªã¯ã©ãæå³ã衚çŸããã
- "This film is hardly a treat."
- åŠå® "hardly" ãå ¥ã
- Bag-of-words ã¢ãã«ã§ã¯é£ãã
- ç³»åç RNN ã§ãããããããŸããã
- åæã®ååïŒè¡šçŸã®æå³ã¯ãåå¥ã®æ§æèŠçŽ ã®æå³ãšãããåæããèŠåããæã
- çµ±èªè«ã®æšè¡šçŸ â æ§ææš
-
ååž°çãã¥ãŒã©ã«ãããã¯ãŒã¯
- æ§ææšã®åç¯ç¹ã§ãé ãç¶æ
ã h = tanh(W[l; r] + b) ãšããŠååž°çã«åæããïŒäžã®ã¹ã±ããåç
§ïŒ
- ç¯ç¹ã®çš®é¡(åè©, åè©, etc.)ã«ãã£ãŠåæã«ãŒã«ãå€ãã â W ãå€ãã
- æ ç»ã®ã¬ãã¥ãŒãæ§æè§£æããåç¯ç¹ã«ã€ããŠæ¥µæ§ïŒããžãã£ãããã¬ãã£ãïŒãã¢ãããŒã
- åŠå®ã«ãã£ãŠã極æ§ããŸã£ããå€ãã£ãŠããŸãããšã
- 粟床ãèŠããšãBigram + ãã€ãŒããã€ãºããå°ãè¯ãã ã
- ãã ãã"not good" "not terrible" ãªã©ã®åŠå®ã䌎ã衚çŸã§éåžžã«é«ç²ŸåºŠ
- å€ãã®æ¡åŒµ
- ã»ã«ã®å®çŸ©ãæš¹ç¶LSTMãNåã®åç¯ç¹ãããã°ã©ãã³ã°èšèªãžã®å¿çš
- ãªã«ã¬ã³ãïŒç¹°ãè¿ãåïŒãšã®æ¯èŒ
- å©ç¹ïŒæ§æã«åŸã£ãæå³ã®è¡šçŸãè¯ãåž°çŽçãã€ã¢ã¹ãåŸé ã®éäŒæè·é¢ãçããäžéç¯ç¹ãžã®ã¢ãããŒã·ã§ã³
- æ¬ ç¹ïŒæ§ææšãå¿ èŠãå³åå²ããæåŸé ã®äŒæè·é¢ãé·ãããããèšç®ãã«ãã
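äžã®ååž°çãã¥ãŒã©ã«ãããã¯ãŒã¯ã®åæ h = tanh(W[l; r] + b) ãæ§ææšã«æ²¿ã£ãŠé©çšããã ãã®ã¹ã±ããã§ããæšã¯ (å·Šéšåæš, å³éšåæš) ã®ã¿ãã«ãèãåèª ID ã§è¡šãä»®ã®åœ¢åŒã§ãéã¿ãæ¬¡å
ã¯ä»®ã®ãã®ã§ãã

```python
import numpy as np

dim, vocab_size = 8, 100
rng = np.random.default_rng(0)
E = rng.normal(scale=0.1, size=(vocab_size, dim))        # èïŒåèªïŒã®åã蟌ã¿
W = rng.normal(scale=0.1, size=(dim, 2 * dim))
b = np.zeros(dim)

def compose(tree):
    if isinstance(tree, int):                            # èïŒåèª IDïŒ
        return E[tree]
    left, right = tree
    l, r = compose(left), compose(right)
    return np.tanh(W @ np.concatenate([l, r]) + b)       # åç¯ç¹ã§åã®è¡šçŸãåæ

# "(this (film (is (hardly (a treat)))))" ã«çžåœããä»®ã®æ§ææš
tree = (1, (2, (3, (4, (5, 6)))))
print(compose(tree).shape)   # æ ¹ç¯ç¹ã®ãã¯ãã«ïŒæå
šäœã®è¡šçŸïŒ
```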
-
æ§æè§£æ
- RNN ææ³
- RNN ã䜿ã£ãŠã·ã³ãã«ïŒå¶åŸ¡ã·ã³ãã«ãçæ
- æšã®ãããããŠã³ãå·Šâå³ã®è¡šçŸãçæ (æšã®SåŒè¡šçŸ)
- ã¹ã¿ãã¯ã«ãããŸã§ã«çæãããã·ã³ãã«ãä¿æ
- 次ã®ã¢ã¯ã·ã§ã³ã®ç¢ºçãã©ãæ±ããã
- é·ãã«äžéãç¡ã â RNN
- éšåæšã®è€éãã«äžéãç¡ã â ååž°çãã¥ãŒã©ã«ããã
- ç¶æ ãããŸãæŽæ°ããªã â stack RNN
- ã¹ã¿ãã¯RNN
- PUSH ãš POP ã®ïŒã€ã®æŒç®
- PUSH ããæã«åã«ãã£ãèŠçŽ ãšã®éã«æ¥ç¶ãäœã â æ¥ç¶ãæšæ§é ãšåãã«ãªã
- ç³»åçæ°è¿æ§ãããæ§æã®æ°è¿æ§ãéèŠ
- ãã©ã¡ãŒã¿æšå®
- çæã¢ãã« p(x, y), æ x ãšæ§ææš y
- èå¥ã¢ãã« GEN ã®ãããã«ãSHIFT æäœã䜿ã
- è§£æ
- ããŒã ãµãŒãã䜿ã
- æ¡ä»¶ä»ãã®èšèªã¢ãã«ãã¢ã¯ã·ã§ã³åã®ç¢ºçãã¢ããªã³ã°ããŠãã
- çµæ
- çæã¢ãã«ã®æ¹ãè¯ããåŸæ¥ææ³ããé«ã Få€
- èšèªã¢ãã«ãšããŠäœ¿ããš LSTM+dropout ããè¯ã
-
åèªã®è¡šçŸ
- ä»»ææ§
- car - c + b = bar
- cat - c + b = bat (åãæŒç®ãªã®ã«çµæãå
šãéã)
- æåã«æå³ã¯ãããïŒ
- cool cooool coooooool â æå³ãäºæž¬ã§ãã
- cat + s = cats, bat + s = bats â èŠåç
- åèªãæ§é ã®ãããªããžã§ã¯ããšããŠèŠã
- 圢æ çŽ è§£æãã圢æ çŽ ããšã®ãã¯ãã«ãåæ
-
- 1. å
ã®åèªãã¯ãã«ã2. 圢æ
çŽ ãã¯ãã«ã®åæã3. æåãã¯ãã«ã®åæããé£çµ
- è€æ°ã®çæã¢ãŒããæ··åããåèªãçæ
- ãã«ã³èªãšãã£ã³ã©ã³ãèªã§ã®èšèªã¢ããªã³ã°
- ã¢ãŒããå¢ããã»ã©ãæ§èœãåäž
-
ãã¥ãŒã©ã«ãããã®èšèªæŠå¿µã®è§£æ
- Linzen, Dupoux, Goldberg (2016)
- èšèªã¢ããªã³ã°ã®ä»£ããã«ãæ°ã®äžèŽãäºæž¬
- Wikipedia ã®æãèªåã§ã¢ãããŒã·ã§ã³
- çµæ
- éäžã«ã¯ããŸãåè©ãç¡ãå Žå â è·é¢ã 14 ãŸã§åºæ¬çã«ãšã©ãŒç㯠0
- ãšã©ãŒçã¯ãéäžã«ã¯ããŸãåè©ïŒã¢ãã©ã¯ã¿ãŒïŒã®æ°ã«åœ±é¿ããã (ãšã©ãŒçïŒ%ã»ã©)
- ä»ã®å®éš
- ææ³çããéææ³çã
- èšèªã¢ããªã³ã°ãç®ç颿°ãšããå ŽåãããŸããããªããæ§æãäžè¬çã«åŠç¿ããŠããããã§ã¯ãªã
- ææ³ãã§ãã«ãŒãäœãå Žåã¯æ³šæïŒïŒææ³ãçŽæ¥åŠç¿ãããã»ããè¯ãïŒ
-
ãŸãšã
- èšèªåŠã®å©ç¹ïŒããè¯ãã¢ãã«ãäœãããã¢ãã«ããã¡ããšåããŠããã調ã¹ããã
| è±èª | æ¥æ¬èª |
|---|---|
| empirical | çµéšç㪠|
| constituent | æ§æèŠçŽ |
| parse | æ§ææš, æ§æè§£æçµæ |
| idiosyncratic | ç¬ç¹ã® |
| confound | 亀絡 |