Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to text. The term "voice recognition" is sometimes used for speech recognition in which the system is trained to a particular speaker, as is the case for most desktop recognition software. Such systems include an element of speaker recognition, which attempts to identify the person speaking in order to better recognize what is being said. Speech recognition in the broad sense covers systems that can recognize almost anybody's speech, such as a call-centre system designed to handle many voices. Voice recognition, by contrast, refers to a system trained to a particular user, which recognizes that user's speech based on their unique vocal sound.

Speech recognition applications include voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), domotic appliance control, content-based spoken audio search (e.g., finding a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or emails), and control in aircraft cockpits (usually termed Direct Voice Input).

History

The first speech recognizer appeared in 1952 and consisted of a device for the recognition of single spoken digits. Another early device was the IBM Shoebox, exhibited at the 1964 New York World's Fair.

One of the most notable domains for the commercial application of speech recognition in the United States has been health care, and in particular the work of the medical transcriptionist (MT). According to industry experts, at its inception speech recognition (SR) was sold as a way to eliminate transcription completely rather than to make the transcription process more efficient, and so it was not accepted. SR at that time was also often technically deficient. Additionally, to be used effectively, it required changes to the ways physicians worked and documented clinical encounters, changes that many, if not all, were reluctant to make. The biggest limitation to speech recognition automating transcription, however, is seen as the software itself. Narrative dictation is highly interpretive and often requires judgment that a human can provide but an automated system cannot yet. Another limitation has been the extensive amount of time required of the user, the system provider, or both to train the software.

A distinction in ASR is often made between "artificial syntax systems" which are usually domain-specific and "natural language processing" which is usually language-specific. Each of these types of application presents its own particular goals and challenges.
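To make the idea of an "artificial syntax system" more concrete, the following is a minimal sketch of a domain-specific command grammar applied to text that a recognizer has already produced. The voice-dialing domain, the contact names, the intent labels, and the `parse_command` helper are all assumptions made for illustration; they do not describe any particular product.

```python
import re

# Hypothetical grammar for a voice-dialing domain. Only utterances that fit
# one of these fixed patterns are accepted; everything else is rejected.
DIGIT_WORDS = "zero|one|two|three|four|five|six|seven|eight|nine"
GRAMMAR = {
    "dial_contact": re.compile(r"^call (?P<contact>home|office|mom)$"),
    "dial_number": re.compile(rf"^dial (?P<digits>(?:(?:{DIGIT_WORDS}) ?)+)$"),
}


def parse_command(transcript):
    """Return (intent, slots) if the transcript fits the grammar, else None."""
    text = transcript.lower().strip()
    for intent, pattern in GRAMMAR.items():
        match = pattern.match(text)
        if match:
            slots = {name: value.strip() for name, value in match.groupdict().items()}
            return intent, slots
    return None


if __name__ == "__main__":
    print(parse_command("Call home"))
    print(parse_command("dial one two three"))
    print(parse_command("please phone my house"))
```

Because the grammar is closed, an utterance such as "please phone my house" is simply rejected, which is the trade-off such systems make in exchange for a small, predictable vocabulary.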

Applications

Health care

In the health care domain, even in the wake of improving speech recognition technologies, medical transcriptionists (MTs) have not yet become obsolete. Many experts in the field anticipate that with increased use of speech recognition technology, the services provided may be redistributed rather than replaced. Speech recognition is also used as an assistive technology for blind people.

Speech recognition can be implemented at the front end or the back end of the medical documentation process.

Front-End SR is where the provider dictates into a speech-recognition engine, the recognized words are displayed right after they are spoken, and the dictator is responsible for editing and signing off on the document. The document never goes through an MT/editor.

Back-End SR, or Deferred SR, is where the provider dictates into a digital dictation system, the voice is routed through a speech-recognition machine, and the recognized draft document is routed along with the original voice file to the MT/editor, who edits the draft and finalizes the report. Deferred SR is currently widely used in the industry.

Many Electronic Medical Records (EMR) applications can be more effective and easier to use when deployed in conjunction with a speech-recognition engine. Searches, queries, and form filling may all be faster to perform by voice than by using a keyboard.

Military

High-performance fighter aircraft

Substantial efforts have been devoted in the last decade to the test and evaluation of speech recognition in fighter aircraft. Of particular note are the U.S. program in speech recognition for the Advanced Fighter Technology Integration (AFTI)/F-16 aircraft (F-16 VISTA), the program in France on installing speech recognition systems on Mirage aircraft, and programs in the UK dealing with a variety of aircraft platforms. In these programs, speech recognizers have been operated successfully in fighter aircraft with applications including: setting radio frequencies, commanding an autopilot system, setting steer-point coordinates and weapons release parameters, and controlling flight displays. Generally, only very limited, constrained vocabularies have been used successfully, and a major effort has been devoted to integration of the speech recognizer with the avionics system.

Some important conclusions from the work were as follows:

  1. Speech recognition has definite potential for reducing pilot workload, but this potential was not realized consistently.
  2. Achievement of very high recognition accuracy (95% or more) was the most critical factor for making the speech recognition system useful; with lower recognition rates, pilots would not use the system.
  3. More natural vocabulary and grammar, and shorter training times would be useful, but only if very high recognition rates could be maintained.
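The text does not say how recognition accuracy was measured in these programs. One common convention, assumed here purely for illustration, is word accuracy: one minus the word-level edit distance between the reference and the recognized text, divided by the number of reference words. The function names and the example utterance below are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Count word-level substitutions, insertions and deletions (edit distance)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] is the edit distance between the reference words processed so
    # far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Return 1 - (word errors / reference length); assumed metric, see lead-in."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / n_ref


if __name__ == "__main__":
    spoken = "set radio frequency one two one point five"
    recognized = "set radio frequency one two nine point five"
    accuracy = word_accuracy(spoken, recognized)
    print(f"word accuracy: {accuracy:.3f}, meets 95% threshold: {accuracy >= 0.95}")
```

Under this assumed metric, a single misrecognized word in an eight-word command already falls well short of the 95% level cited above, which suggests why short, constrained vocabularies were favored.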

Laboratory research in robust speech recognition for military environments has produced promising results which, if extendable to the cockpit, should improve the utility of speech recognition in high-performance aircraft.

Working with Swedish pilots flying in the JAS-39 Gripen cockpit, Englund (2004) found that recognition deteriorated with increasing G-loads. It was also concluded that adaptation greatly improved the results in all cases, and introducing models for breathing was shown to improve recognition scores significantly. Contrary to what might be expected, no effects of the broken English of the speakers were found. As could be expected, spontaneous speech caused problems for the recognizer. A restricted vocabulary and, above all, a proper syntax could thus be expected to improve recognition accuracy substantially.

The Eurofighter Typhoon currently in service with the UK RAF employs a speaker-dependent system, i.e. it requires each pilot to create a template. The system is not used for any safety-critical or weapon-critical tasks, such as weapon release or lowering of the undercarriage, but is used for a wide range of other cockpit functions. Voice commands are confirmed by visual and/or aural feedback. The system is seen as a major design feature in the reduction of pilot workload, and even allows the pilot to assign targets to himself with two simple voice commands or to any of his wingmen with only five commands.
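The policy described here, voice control for routine functions only, with every command acknowledged by feedback, can be sketched as a simple dispatcher. This is an illustration under stated assumptions, not a description of the actual Typhoon software: the command strings, the two sets, and the `handle_voice_command` function are all hypothetical.

```python
# Hypothetical command vocabulary; the real cockpit vocabulary is not given
# in the text. Safety-critical and weapon-critical actions are never
# executed by voice in this sketch.
SAFETY_CRITICAL = {"release weapon", "lower undercarriage"}
VOICE_ENABLED = {"select radio one", "show fuel page", "assign target"}


def handle_voice_command(command):
    """Return (executed, feedback) for an already-recognized command string."""
    normalized = command.lower().strip()
    if normalized in SAFETY_CRITICAL:
        # Refuse, but still tell the pilot what happened.
        return False, f"'{normalized}' is not available by voice"
    if normalized in VOICE_ENABLED:
        # The feedback string stands in for visual and/or aural confirmation.
        return True, f"confirmed: {normalized}"
    return False, "command not recognized"


if __name__ == "__main__":
    for spoken in ("Show fuel page", "Release weapon", "open canopy"):
        print(spoken, "->", handle_voice_command(spoken))
```

Keeping the excluded commands in an explicit set, rather than merely omitting them from the vocabulary, lets the system give distinct feedback for a refused command versus an unrecognized one.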

Helicopters

The problems of achieving high recognition accuracy under stress and noise pertain strongly to the helicopter environment as well as to the fighter environment. The acoustic noise problem is actually more severe in the helicopter environment, not only because of the high noise levels but also because the helicopter pilot generally does not wear a facemask, which would reduce acoustic noise in the microphone. Substantial test and evaluation programs have been carried out in the past decade in speech recognition systems applications in helicopters, notably by the U.S. Army Avionics Research and Development Activity (AVRADA) and by the Royal Aerospace Establishment (RAE) in the UK. Work in France has included speech recognition in the Puma helicopter. There has also been much useful work in Canada. Results have been encouraging, and voice applications have included: control of communication radios; setting of navigation systems; and control of an automated target handover system.

As in fighter applications, the overriding issue for voice in helicopters is the impact on pilot effectiveness. Encouraging results are reported for the AVRADA tests, although these represent only a feasibility demonstration in a test environment. Much remains to be done, both in speech recognition itself and in overall speech technology, in order to consistently achieve performance improvements in operational settings.

Battle management

Battle management command centres generally require rapid access to and control of large, rapidly changing information databases. Commanders and system operators need to query these databases as conveniently as possible, in an eyes-busy environment where much of the information is presented in a display format. Human-machine interaction by voice has the potential to be very useful in these environments. A number of efforts have been undertaken to interface commercially available isolated-word recognizers into battle management environments. In one feasibility study, speech recognition equipment was tested in conjunction with an integrated information display for naval battle management applications. Users were very optimistic about the potential of the system, although capabilities were limited.
