Speech Recognition and Speech Synthesis with the Web Speech API

This article builds speech recognition and speech synthesis apps that run in the browser using the Web Speech API.

Creating the speech recognition and synthesis app

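The speech recognition and synthesis app is listed below. Pressing the "音声認識" (speech recognition) button starts recognition, and each recognized utterance is written to the console log. The app then checks whether the utterance contains "開始" (start) or "終了" (stop). Pressing the "音声合成" (speech synthesis) button makes the browser speak "まわります".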

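Line 10 creates the "音声合成" (speech synthesis) button, and line 11 creates the "音声認識" (speech recognition) button.
Line 15 defines the handler that runs when the "音声合成" button is pressed; line 18 speaks "まわります".
Line 20 defines the handler that runs when the "音声認識" button is pressed. Line 30 checks whether the utterance contains "開始", and line 33 checks whether it contains "終了".
When recognition ends, control passes to line 42, which starts recognition again.
Line 46 makes the initial call that starts recognition.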

speech.html

<html>

  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"  />
  <script src="https://code.jquery.com/jquery-3.2.1.min.js"></script>

  <body>
    <h3>音声合成・音声認識デモ</h3>
    <button class="btn btn-primary" id="Synthesis">音声合成</button>
    <button class="btn btn-primary" id="Recognition">音声認識</button>

    <script>

      $("#Synthesis").click(function () {
        const uttr = new SpeechSynthesisUtterance("まわります");
        // Speak the utterance (adds it to the utterance queue)
        speechSynthesis.speak(uttr);
      });
      $("#Recognition").click(function () {

        const recognition = new (window.SpeechRecognition || window.webkitSpeechRecognition)();
        recognition.lang = "ja-JP";
        recognition.continuous = true;
        recognition.onresult = function (event) {
          // omit
          console.log("onresult3! " + event.resultIndex);
          console.log("onresult2! " + event.results[event.resultIndex][0]
                  .transcript);
          if (event.results[event.resultIndex][0].transcript.includes(
                  '開始')) {
            console.log("start! ");
          } else if (event.results[event.resultIndex][0].transcript.includes(
                  '終了')) {
            console.log("stop! ");
          }

        };
        recognition.onspeechend = function (event) {
          console.log("onspeechend! " + event);
        };
        recognition.onend = function (event) {
          console.log("onend! " + event);
          recognition.start();
        };
        recognition.start();
      });

    </script>
  </body>

</html>
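
The keyword check in the onresult handler can be pulled out into a small pure function, which makes it possible to test the logic without a microphone. The following is only a minimal sketch and is not part of speech.html; the function name detectCommand and its return values are hypothetical.

```javascript
// Hypothetical helper (not in speech.html): maps a recognized transcript
// to a command name, mirroring the checks for "開始" and "終了" in onresult.
function detectCommand(transcript) {
  if (transcript.includes("開始")) {
    return "start";
  }
  if (transcript.includes("終了")) {
    return "stop";
  }
  return null; // neither keyword was spoken
}

// Possible use inside the handler from the listing:
// recognition.onresult = function (event) {
//   const text = event.results[event.resultIndex][0].transcript;
//   console.log(detectCommand(text));
// };
```

As in the listing's if / else if chain, "開始" takes priority when both keywords appear in the same utterance.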

Running the speech recognition and synthesis app

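Open speech.html in a browser to display the page with the two buttons. Pressing the "音声合成" button makes the browser speak "まわります". After pressing the "音声認識" button, saying "開始して" or "終了して" writes the recognized text to the console log.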
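
The listing only logs a message when "終了" is recognized, and its onend handler restarts recognition unconditionally, so recognition never actually stops. If "終了" should really end recognition, one possible arrangement is to guard the restart with a flag. This is a sketch under that assumption; createListener and its methods are hypothetical names, and only start(), stop(), and onend are taken from the recognition object.

```javascript
// Hypothetical wrapper (not in speech.html): restarts recognition from onend
// only while the user still wants to listen.
function createListener(recognition) {
  let listening = false;

  recognition.onend = function () {
    if (listening) {
      recognition.start(); // keep listening, as the original listing does
    }
  };

  return {
    start() {
      if (!listening) {
        listening = true;
        recognition.start();
      }
    },
    stop() {
      listening = false; // the next onend will not restart
      recognition.stop();
    },
    isListening() {
      return listening;
    },
  };
}

// Possible use in the browser:
// const listener = createListener(recognition);
// listener.start();                  // in the button's click handler
// listener.stop();                   // when "終了" is recognized
```

Because start() ignores repeated calls while listening, pressing the button twice does not start the same recognizer twice.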

The following message may appear in the console during speech synthesis:
"[Deprecation] speechSynthesis.speak() without user activation is no longer allowed since M71, around December 2018. See https://www.chromestatus.com/feature/5687444770914304 for more details"
This means that speechSynthesis.speak() is not allowed without user activation, so speech must be triggered by a user action such as a button click.
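
One way to respect this restriction when text becomes available before the user has interacted with the page is to hold it back until the first gesture. The following is a rough sketch of that idea, not something taken from the listings. The name createGestureGatedSpeaker and its methods are hypothetical, and the flag is only the page's own bookkeeping; the browser remains the authority on whether activation has occurred.

```javascript
// Hypothetical helper: queues text until a user gesture has been seen,
// then passes each queued text to speakFn.
function createGestureGatedSpeaker(speakFn) {
  let activated = false;
  const pending = [];

  return {
    // Call this from a click (or similar) event handler.
    notifyUserGesture() {
      activated = true;
      while (pending.length > 0) {
        speakFn(pending.shift());
      }
    },
    // Speaks immediately after a gesture; otherwise queues the text.
    say(text) {
      if (activated) {
        speakFn(text);
      } else {
        pending.push(text);
      }
    },
    pendingCount() {
      return pending.length;
    },
  };
}

// Possible use in the browser:
// const speaker = createGestureGatedSpeaker(function (text) {
//   speechSynthesis.speak(new SpeechSynthesisUtterance(text));
// });
// $("#Synthesis").click(function () { speaker.notifyUserGesture(); });
```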

Creating the speech synthesis app

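The speech synthesis app is listed below. With the "テキスト読み上げ" (read text aloud) checkbox checked, moving the mouse over a text block underlines it and, at the same time, speaks the text.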

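Lines 20-34 handle the mouse entering a text block, and lines 35-41 handle the mouse leaving it. Both handlers are also bound to the click event.
Lines 22 and 36 check whether "テキスト読み上げ" is checked.
Line 27 adds the "hover" class, and the style rule on lines 6-7 displays the underline.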

WebSpeech/index.html

<html>
<meta charset="utf-8" />
<script src="https://code.jquery.com/jquery-3.2.1.min.js"></script>

<style>
  .speech.hover {
    text-decoration: underline !important;
  }
</style>

<body>
  テキスト読み上げ：
  <input class="switch__input" type="checkbox" id="js-speech-ctrl">ON

  <div class="speech"><br>TomoSoftでございます。</div>

  <div class="speech">ご連絡をいたします</div>

  <script>
    $('.speech').on('click mouseenter', function () {
      console.log('mouseenter event fired!');
      if ($("#js-speech-ctrl").prop('checked')) {
        if (!window.speechSynthesis) {
          alert("This browser does not support speech synthesis");
        }
        else {
          $(this).addClass("hover");
          const n = new SpeechSynthesisUtterance();
          n.text = $(this).text();
          n.lang = "ja-JP";
          speechSynthesis.speak(n);
        }
      }
    });
    $('.speech').on('click mouseleave', function () {
      if ($("#js-speech-ctrl").prop('checked')) {
        console.log('mouseleave event fired!');
        $(this).removeClass("hover");
        speechSynthesis.cancel();
      }
    });
  </script>
</body>

</html>
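
The branching in the mouseenter handler (checkbox state first, then browser support) can be summarized as a small decision function. This is a minimal sketch for illustration only; decideHoverAction and its return values are hypothetical and do not appear in index.html.

```javascript
// Hypothetical helper (not in index.html): decides what the mouseenter
// handler should do, given the checkbox state and browser support.
function decideHoverAction(isChecked, hasSpeechSynthesis) {
  if (!isChecked) {
    return "ignore"; // read-aloud is switched off
  }
  if (!hasSpeechSynthesis) {
    return "unsupported"; // the listing shows an alert in this case
  }
  return "speak"; // add the hover class and speak the text
}

// Possible use in the browser:
// const action = decideHoverAction(
//   $("#js-speech-ctrl").prop("checked"),
//   "speechSynthesis" in window
// );
```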

Running the speech synthesis app

Open index.html in a browser to display the page. Check "テキスト読み上げ" and place the mouse over a text block: the text is underlined and its content is spoken.