Custom SRGS-based grammar with free-form text

Shared by user 野性稳江山.

I am trying to develop a voice-based application that accepts user input as speech and performs actions based on that input. This is my first venture into this technology, and I am learning as I build it.

I am using Microsoft SAPI, which ships with .NET 4, to recognize speech. So far, I have learned that it supports two modes.

Speech recognition (SR) has two modes of operation:   

Dictation mode — an unconstrained, free-form speech interpretation mode that uses a built-in grammar provided by the recognizer for a specific language. This is the default recognizer.

Grammar mode — matches spoken words to one or more specific context-free grammars (CFGs). A CFG is a structure that defines a specific set of words, and the combination of these words that can be used. In basic terms, a CFG defines the sentences that are valid for SR. Grammars must be supplied by the application in the form of precompiled grammar files or supplied at runtime in the form of W3C Speech Recognition Grammar Specification (SRGS) markup or the older CFG specification. The Windows SDK includes a grammar compiler: gc.exe.

So essentially, the engine recognizes only the words I specify in the grammar. But I also want to include some free-form text alongside the structured grammar. People's names are one example: to capture a name from speech, that name has to be listed in the grammar, which is not possible if the application is open for anyone to use.

Is there a way I can extract some text which is not a part of the grammar already?

How can I get the system to recognize sentences such as "My name is Gary and I am 25 years old"? The name can be absolutely anything, so how do I define it in my grammar?

Recommended answer

You can mix dictation mode with grammar mode by embedding dictation slots inside a grammar rule; see this example from MSDN:

http://msdn.microsoft.com/en-us/library/ms723634(v=vs.85).aspx

<GRAMMAR>
    <!-- command to handle first and last names with semantic properties -->
    <!-- By using semantic properties, the application can ignore all of
        the text returned, except for the text associated with the dictation
        tags' semantic properties "PID_FirstName" and "PID_LastName" -->
    <RULE ID="SubmitName" TOPLEVEL="ACTIVE">
        <P>
            my first name is
            <!-- Note the implicit maximum is only one word -->
            <DICTATION PROPID="PID_FirstName"/>
            and my last name is
            <!-- Note the explicit maximum is two words (MAX="2") -->
            <DICTATION PROPID="PID_LastName" MAX="2"/>
        </P>
    </RULE>
</GRAMMAR>
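
To make the slot behaviour of that rule concrete, here is a minimal Python sketch that mimics it on text that has already been recognized. It does not call SAPI or any speech engine. The `{Name:max}` template syntax and the helpers `compile_template` and `extract` are hypothetical, invented only for this illustration. It also assumes each slot takes at least one word and at most `max` words, which is how the comments in the grammar above describe the `DICTATION` tags.

```python
import re

# Hypothetical slot syntax for this sketch only: {PropertyName:max_words}
SLOT = re.compile(r"\{(\w+):(\d+)\}")


def compile_template(template):
    """Build a regex from fixed words plus bounded free-form slots."""
    pieces = []
    for token in template.split():
        slot = SLOT.fullmatch(token)
        if slot:
            name, max_words = slot.group(1), int(slot.group(2))
            if max_words < 1:
                raise ValueError("a slot must allow at least one word")
            # Assumption: a slot takes 1..max_words whitespace-separated words.
            pieces.append(
                r"(?P<%s>\S+(?:\s+\S+){0,%d}?)" % (name, max_words - 1)
            )
        else:
            # Fixed grammar word: must appear literally (case-insensitive).
            pieces.append(re.escape(token))
    return re.compile(r"\s*" + r"\s+".join(pieces) + r"\s*", re.IGNORECASE)


def extract(template, utterance):
    """Return {property: text} for a matching utterance, else None."""
    match = compile_template(template).fullmatch(utterance)
    return match.groupdict() if match else None


if __name__ == "__main__":
    rule = ("my first name is {PID_FirstName:1} "
            "and my last name is {PID_LastName:2}")
    print(extract(rule, "My first name is Gary and my last name is Van Dyke"))
    print(extract("my name is {Name:1} and I am {Age:1} years old",
                  "My name is Gary and I am 25 years old"))
```

As in the grammar, the application can ignore the fixed words and read only the named properties. An utterance that puts more words into a slot than its maximum allows does not match the rule at all.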