Physically disabled and unable to speak: conveying feelings and meaning with the eyes
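Computer and internet technology has made communication between people more convenient: by typing, recording audio, or even converting speech to text, we can send our thoughts a thousand miles away. Yet patients who cannot speak and are physically disabled struggle to communicate even with people right beside them. The Chinese University of Hong Kong (CUHK) has developed artificial intelligence (AI) technologies that "translate" for dysarthria patients whose speech is unclear, and send messages for patients whose limbs are paralysed.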
¤å¡G§õªY±Ó¡B´Ó½«ë
Technology 1: Eye-movement AAC system
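If you cannot speak, you can communicate through writing or sign language. But what if you cannot move your hands and feet either? One of the research projects under the "Speech and Language Intelligence" programme of CUHK's Centre for Perceptual and Interactive Intelligence (CPII) hopes to use technology to "speak" for these people. "I have seen students who cannot move their bodies, yet their eyes are very 'lively'. They want to express themselves, but their physical disability stops them." Chan Yuen-yan (陳苑茵), adjunct assistant professor in CUHK's Department of Information Engineering and a lead researcher at CPII, found that some students with severe intellectual disabilities face these difficulties, so she and her team began research on an eye-movement system for augmentative and alternative communication (AAC).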
Improving the touch-screen electronic communication book
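Back in 2014, Chan attended a talk on AAC and learned that some people have difficulty communicating. "I thought at the time: since I come from information engineering and studied communication engineering, could I put what I learned to use?" She then threw herself into research on electronic AAC.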
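Chan worked with the rehabilitation organisation SAHK (香港耀能協會) to develop the communication app 「通話易」. The system contains a range of picture cards that can be combined into sentences and read aloud. It also uses cloud-based assistive communication technology so that users can talk with family and friends in real time. The app won the Gold Award in the Smart People (Smart Inclusion) category of the Hong Kong ICT Awards 2020.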
Chan noticed that with traditional AAC picture cards, users tend to choose only one card, but once the cards became an app, users eagerly chose several cards at a time.
A tailor-made database
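During testing of 「通話易」, however, patients with cerebral palsy could not use touch-based AAC because of limited control of their hands and feet, and some students with severe intellectual disabilities had similar difficulties. This prompted Chan to develop eye-movement AAC. The first step was to build an AAC database. The team gathered its data by interviewing caregivers one by one to understand users' daily activities and thoughts. The method is not high-tech, but it gives a tailor-made result. Team members Lam and Wong said that once the system platform was built, it took a great deal of time and care to communicate with caregivers and users and to fine-tune the content and settings so that they met users' needs. Lee Yuet-sheung (李月裳), associate professor in the Department of Otorhinolaryngology, Head and Neck Surgery at CUHK's Faculty of Medicine and head of its speech therapy division, serves as a co-investigator on the project and provided professional advice during development.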
Technology 2: Speech reconstruction, with AI as "interpreter"
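Another group of patients who "cannot get their words out" have dysarthria caused by brain disease or nerve damage. Using AI, speech reconstruction technology and an expanded database of disordered speech, the CUHK team reconstructs patients' slurred, unclear speech into normal speech. Last year the work won the Open Group championship in the Hong Kong Science Park's "SciTech Challenge" entrepreneurship competition, and it will gradually be put into use.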
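According to the American Speech-Language-Hearing Association (ASHA), dysarthria is a neurological speech disorder. Common signs in adult patients include abnormalities in the strength, speed, range, steadiness and tone of speech; reduced accuracy in breathing, phonation, resonance, articulation or prosody; and poor function of the facial, oral and other muscles used for articulation, or muscle spasms and weakness.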
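Automatic speech recognition (ASR) and text-to-speech synthesis (TTS) are speech technologies the team has been researching and developing for many years. Helen Meng Mei-ling (蒙美玲), Patrick Huen Wing Ming Professor of Systems Engineering and Engineering Management at CUHK and leader and lead researcher of CPII's Speech and Language Intelligence programme, explains that ASR uses AI to let machines accept speech input, learn human language from speech databases, and then turn people's speech into text. TTS, in turn, uses AI to let machines turn text that people enter into speech output.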
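Research on speech technology requires big data. Meng says that speech contains a great deal of both variability and invariance. Variability includes differences between speakers in intonation, emotion, accent, physical condition and so on, all of which change the speech signal. Invariance can be, for example, the words being used. Because so many variables are present, speech technology research needs big data to recognise what is being said.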
Years spent expanding the Cantonese database
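Existing speech databases are biased, however. Their data generally come from people who work in IT, use digital products often, and speak the local language with little accent, and most of them are adults. By contrast, there is not enough speech data from people with accents, people with speech disorders, the elderly and children to support ASR, so their speech is hard to recognise accurately. In addition, publicly available speech databases lack data for some languages, including Cantonese.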
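Through research on speech reconstruction, ASR and TTS, Meng and her interdisciplinary team have also spent several years expanding the Cantonese speech database, to help dysarthria patients and their caregivers communicate better. Since 2013, the team has worked with CUHK's Faculty of Medicine and patient groups, inviting dozens of dysarthria patients of different ages to record everyday Cantonese expressions and so enlarge the stock of disordered-speech data. During recording, patients read out Cantonese sentences carefully designed by the research team, which try to cover all Cantonese sounds in as few words as possible, to collect the speech data the research needs.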
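The idea of covering every sound with as few sentences as possible can be illustrated with a greedy set-cover selection. This is only a sketch of one common approach: the article does not say how the team actually designed its sentences. The function name `pick_prompts`, the sentence names and the sound-unit labels (`u1` to `u6`) are hypothetical placeholders, not real Cantonese data.

```python
def pick_prompts(candidates, targets):
    """Greedy set cover: repeatedly take the sentence that adds the most missing units.

    candidates: dict mapping a sentence name to the set of sound units it contains
    targets:    set of all sound units the recordings should cover
    Returns (chosen sentence names in pick order, units no sentence could cover).
    """
    missing = set(targets)
    chosen = []
    while missing:
        # Sentence that covers the largest number of still-missing units
        # (ties go to the first such sentence in the dict).
        best = max(candidates, key=lambda name: len(candidates[name] & missing))
        gained = candidates[best] & missing
        if not gained:
            break  # the remaining units appear in no candidate sentence
        chosen.append(best)
        missing -= gained
    return chosen, missing


# Hypothetical toy data: abstract unit labels, not real phonetic transcriptions.
candidates = {
    "sentence_1": {"u1", "u2", "u3"},
    "sentence_2": {"u3", "u4"},
    "sentence_3": {"u4", "u5", "u6"},
    "sentence_4": {"u1", "u6"},
}
targets = {"u1", "u2", "u3", "u4", "u5", "u6"}

chosen, missing = pick_prompts(candidates, targets)
print(chosen)   # ['sentence_1', 'sentence_3']
print(missing)  # set()
```

Greedy selection does not guarantee the smallest possible set, but it is simple and usually close. A real prompt list would also have to be natural and easy for patients to read aloud, which this sketch ignores.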
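Building on ASR and TTS, together with the speech reconstruction technology the team developed, the system receives a patient's slurred speech, uses AI algorithms to turn it into normal speech, and finally applies voice conversion so that the output comes close to the speaker's own pronunciation and intonation.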
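The flow described above can be sketched as a chain of three stages. This is an assumption-laden illustration, not the team's system: the article does not say whether the stages are cascaded like this, and `Audio`, `reconstruct_speech` and the three `fake_*` stand-ins are hypothetical names with toy behaviour in place of real models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Audio:
    samples: List[float]  # placeholder for a waveform
    voice: str            # label for whose voice the audio sounds like


def reconstruct_speech(
    unclear: Audio,
    recognise: Callable[[Audio], str],
    synthesise: Callable[[str], Audio],
    convert_voice: Callable[[Audio, str], Audio],
) -> Audio:
    """Assumed cascade: unclear speech -> text -> clear speech -> speaker's own voice."""
    text = recognise(unclear)                   # ASR stage: speech to text
    clear = synthesise(text)                    # TTS stage: text to clear speech
    return convert_voice(clear, unclear.voice)  # voice conversion stage


# Toy stand-ins so the sketch runs; real stages would be trained models.
def fake_recognise(audio: Audio) -> str:
    return "hello"


def fake_synthesise(text: str) -> Audio:
    return Audio(samples=[0.0] * len(text), voice="generic")


def fake_convert(audio: Audio, voice: str) -> Audio:
    return Audio(samples=audio.samples, voice=voice)


result = reconstruct_speech(
    Audio(samples=[0.1, 0.2], voice="patient_01"),
    fake_recognise,
    fake_synthesise,
    fake_convert,
)
print(result.voice, len(result.samples))  # patient_01 5
```

Passing the stages in as functions keeps the sketch independent of any particular speech library and makes each stage replaceable on its own.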
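The team hopes to turn this technology into an app in future, so that more people in need can use it in different places. But Meng stresses: "AI is not all-powerful. It is only because of many years of research by teams around the world that AI has this ability."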