Disrupt Berlin 2018 sponsor Otter Voice Notes uses AI to create live, searchable transcrip...

One of the most overtly impressive products on display at Disrupt Berlin was Otter Voice Notes, a live transcription service from AISense, powered by artificial intelligence. The product generates searchable archives of presentations, meetings, interviews, or any other audio feed, delivering transcribed text synchronized to an audio recording, with identified speakers and keyword metadata.




The product was hard to miss at this week's Disrupt Berlin 2018 conference, where it drove an accessibility display next to the stage that provided live captions for attendees with hearing impairments (above).

The service also automatically delivered Disrupt conference transcripts to the free Otter Voice Notes app, organized by session. Without any editing, the service does an impressive job of transcribing speech and then cutting up the raw text feed into blocks for each unique speaker. With some editing and manual identification of speakers, the system can learn to identify who is talking and automatically turn a panel discussion into what looks like a screenplay (below, the simplified view available from a browser).




After you manually identify a few speech bubbles, the service learns to identify that speaker in the future. This results in organized transcripts of business meetings or other discussions. Transcripts can be reviewed by the user or shared among teams, using an interface that looks like a voice-generated form of the Slack messaging app, with sets of conversations organized within groups. Subjects discussed are searchable by tagged keywords or by subject, and you can quickly get the gist of a meeting by looking at its automatically generated word cloud.
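To make the idea concrete, here is a minimal Python sketch of how a speaker-labeled transcript could be merged into per-speaker blocks, counted for word-cloud keywords, and searched. It is an illustration only, not Otter's implementation or API: the `Segment` type, the function names, the sample data, and the tiny stop-word list are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Segment:
    speaker: str   # hypothetical label, assigned manually or by a recognizer
    start: float   # offset into the audio recording, in seconds
    text: str


# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "to", "is", "and", "of", "we", "for", "our", "in"}


def merge_by_speaker(segments):
    """Collapse consecutive segments from the same speaker into one block."""
    blocks = []
    for speaker, run in groupby(segments, key=lambda s: s.speaker):
        run = list(run)
        blocks.append(Segment(speaker, run[0].start, " ".join(s.text for s in run)))
    return blocks


def keyword_counts(segments):
    """Count non-stop-words across all segments (the raw input to a word cloud)."""
    words = re.findall(r"[a-z']+", " ".join(s.text for s in segments).lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def search(segments, keyword):
    """Return the segments whose text mentions the keyword, case-insensitively."""
    return [s for s in segments if keyword.lower() in s.text.lower()]


# Made-up sample data standing in for a raw transcript feed.
raw = [
    Segment("Speaker 1", 0.0, "Welcome to the budget review."),
    Segment("Speaker 1", 2.4, "The budget is tight."),
    Segment("Speaker 2", 5.1, "We need a bigger budget for hiring."),
    Segment("Speaker 1", 9.0, "Hiring is on hold."),
]
blocks = merge_by_speaker(raw)
counts = keyword_counts(blocks)
hits = search(blocks, "hiring")

for b in blocks:
    print(f"[{b.start:5.1f}s] {b.speaker}: {b.text}")
print(counts.most_common(2))
```

Keeping the start offset on each block is what would let a reader jump from a line of text back to the matching point in the recording.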



The Otter Voice Notes mobile client works on both Android and iOS, but the iOS version includes support for some extra features, including local calendar and contacts integration, support for Dark Mode, AirPrint, Touch/Face ID conversation locking, one-touch Widget recording, and Siri Shortcuts.

The highly rated app is free, and the service is free to use for the first 600 minutes of audio processed. AISense offers a Premium version that includes 6,000 minutes of transcription each month for $9.99 per month, along with prioritized email support and additional audio playback speeds. There's also a $4.99 version for students and teachers.
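For a sense of scale, the quoted allowances can be converted to hours with a few lines of Python. The figures come from the pricing above; the variable and function names are illustrative, and the per-hour number assumes the entire Premium allowance is used in a month.

```python
FREE_MINUTES = 600        # free allowance quoted above
PREMIUM_MINUTES = 6000    # Premium allowance per month
PREMIUM_PRICE = 9.99      # US dollars per month


def cost_per_hour(price, minutes):
    """Effective price per hour of audio if the whole allowance is used."""
    return price / (minutes / 60)


free_hours = FREE_MINUTES / 60
premium_hours = PREMIUM_MINUTES / 60
premium_hourly = cost_per_hour(PREMIUM_PRICE, PREMIUM_MINUTES)

print(f"Free tier: {free_hours:.0f} hours")
print(f"Premium:   {premium_hours:.0f} hours, about ${premium_hourly:.2f} per hour")
```

That works out to 10 hours of free audio and 100 hours on Premium, or roughly 10 cents per transcribed hour at full use.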

The mobile app also supports integration with the Camera app, which automatically imports any photos taken during a recording session and inserts them into the transcribed conversation. The service also integrates with Zoom teleconferencing cloud recordings, saving webcasts as transcripts in your Otter Voice Notes account. This makes it easy to create searchable archives of calls that are already being recorded.

Asked by a user about integration with products like Amazon Alexa and Google Home, a representative of the company replied "we don't see the enterprise using voice appliances," a reminder that the "voice first" media narrative behind those products is far away from the current reality.

In fact, rather than voice being an incredible computing interface, products like Otter Voice Notes--and other uses of dictation, including Apple's own Visual Voicemail--show that searchable, readable text in a visual user interface is often capable of exposing information that would be otherwise difficult and time-consuming to review later.

AISense also notes that it plans Otter integration with phone calls, which it says is "coming soon." In addition to live recordings and Zoom integration, Otter also lets you import audio from a variety of sources including MP3, AAC, M4A and WMA audio, and MP4, AVI, MOV, WMV, and MPG video. On iOS, you can even share audio from any app by installing Otter's iOS Share Sheet.
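As a rough illustration of the import list above, the following Python sketch sorts filenames into audio or video by extension. It is not Otter's code: the `classify` helper is hypothetical, and it only inspects the file extension, whereas a real service would presumably examine the file contents as well.

```python
from pathlib import Path

# Formats named in the article, keyed by file extension.
AUDIO_FORMATS = {".mp3", ".aac", ".m4a", ".wma"}
VIDEO_FORMATS = {".mp4", ".avi", ".mov", ".wmv", ".mpg"}


def classify(filename):
    """Return 'audio', 'video', or None for a format not on the list."""
    suffix = Path(filename).suffix.lower()
    if suffix in AUDIO_FORMATS:
        return "audio"
    if suffix in VIDEO_FORMATS:
        return "video"
    return None


for name in ["interview.MP3", "panel.mov", "notes.flac"]:
    print(name, "->", classify(name))
```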

The service is currently limited to English, although it seemed to work quite well (albeit not perfectly) across a wide range of speakers at Disrupt who were either not native English speakers or who had a distinct regional accent, in some cases understanding the speaker better than I could. If Otter makes a mistake, you can edit the transcript, listening to the recorded audio to review what was actually said.

AISense initially launched Otter Voice Notes in February at Mobile World Congress, and calls its recognition technology "Ambient Voice Intelligence." The startup is based in Silicon Valley and claims a team of PhDs and engineering voice experts from Google, Facebook, Yahoo, and Nuance.

Other event sponsors of interest at Disrupt Berlin: Here, Universe

HERE Technologies, which resulted from Nokia's divestiture of its mapping unit, was on location to promote its new freemium model for providing developers with map, geocoding, routing and place data.

The firm also called attention to its free Here WeGo app, which provides an alternative to Apple's bundled Maps (and the privacy qualms of using Google Maps). It offers more public transit information (across 1,300 cities) and supports downloading offline maps (covering the U.S., Australia, Canada, the UK, France, Germany, Italy, Spain, and 100 other countries) for use when data service may not be reliable. It also provides an iMessage App extension for calling up and sharing addresses while in a conversation (below).


Here WeGo includes an iMessage App


Universe Events, another Disrupt sponsor that the event itself made use of, provides event management for the hosts of concerts, classes, and festivals. That includes ticket processing for attendees, integration with Stripe to make sales funds available to the event coordinator in real-time, and integrated website embedding of the ticketing process so that users aren't redirected to an external site for processing.

The firm's Universe Discover app provides a local events guide for individuals, with social features to enable users to follow friends and their favorite event organizers to find upcoming events. Users can buy tickets using Apple Pay and immediately download a barcode ticket to their mobile device for entry.

Comments

  • Reply 1 of 1
    This looks like a great application of an automated transcription service, something the industry has been calling for for years, and is likely to be the end of the dark art of stenography.
    Our trials of AI transcription services have been fantastic for single speakers, so I can see this really taking off next year. Multiple speakers in more natural environments are still rather challenging for ASR, and I believe it will still be a few more years until this kind of service can offer quality transcripts. A hybrid solution which combines both human transcribers and automated transcription is currently the best method for anything more complex.
    Take Note https://takenotetyping.com & transcribeme.com offer the best of these hybrid solutions at the moment. 