
Monthly Archive: June 2008



DSP & Just for fun & Software & Speech processing & User interface Ivan Elhart on 30 Jun 2008

Sony’s MP3 dancing robot - Rolly

Sony revealed an egg-shaped digital music player named Rolly (picture below) at the end of 2007, but I didn’t have the chance to see it until last weekend. It plays MP3 and AAC music files and supports direct music streaming over a Bluetooth connection. And it is able to dance.

Sony Rolly

The Rolly is more than an ordinary music player. It is a motion-controlled robot with a bunch of sensors, color lights, and two flapping wings. It uses two wheels that surround the body to roll, wiggle, and spin. In the vertical position the wheels can be used to change songs and adjust the volume. The Rolly creates motion automatically by analyzing the music, so it can dance to any song. It is also possible to create new motions or customize existing ones using PC software. You can see the Rolly’s dance in the video below. It is amazing how well the sound and motion are synchronized.

Ivan Elhart

R&D zeljko.medenica on 30 Jun 2008

Visiting Microsoft Research

This summer I have a great opportunity to do an internship with Microsoft Research. The purpose of my work will be to run experiments on navigation devices used in vehicles. The most valuable thing about an internship at Microsoft is the hands-on experience with things that may actually be used commercially, since all the research is geared toward practical applications. This is also a great way to expand the collaboration between UNH and Microsoft, since in this research we will be using our Project54 driving simulator. My work will be supervised by Tim Paek, with whom Professor Kun and I published a paper last year at the Interspeech 2007 conference in Belgium.

One interesting fact is that Microsoft Research was founded in 1991, and at that time Microsoft was one of the first software companies to create its own computer science research organization. It has more than 800 researchers covering over 55 areas of research in computer science, engineering, and general science. Some more interesting facts can be found here.

In order to set everything up for the projects that we will be working on this summer, I spent a week at Microsoft in Redmond. Although I had a chance to visit Microsoft briefly last year, only now did I realize how big their campus is. On one occasion I had to take care of some administrative things in a building located at the far end of the campus. At first it did not look that far, but after walking for about half an hour I realized that they don’t have shuttles for nothing! Everything is very neatly organized and modern looking, as can be seen in the pictures below.

Microsoft Research building in the background.

Microsoft campus

A cafe next to one of the canteens.

There are also many conveniences on campus for employees, such as bus and shuttle transportation, sports fields, canteens, a library, and professional lectures and seminars.

Microsoft Research really looks like a very nice place to work, since employees have a lot of freedom in organizing their own work and research. I am looking forward to having a great internship this summer.

Zeljko Medenica

Multitouch & Technology & Ubicomp & User interface Andrew Kun on 28 Jun 2008

TouchKit fabrication prototype picture on Flickr

TouchKit fabrication prototype (check!), originally uploaded by stfnix.

Apparently the folks at Nortd are getting closer to shipping some TouchKits. The picture above was just posted on June 27. This looks like a good size screen!

Andrew Kun

Driving simulator & People & Speech user interface oszkar on 27 Jun 2008

Automotive discussions at YRRSDS’08

This year’s Young Researchers’ Roundtable on Spoken Dialog Systems was a great place to share ideas with fellow speech research enthusiasts. Lots of attention was devoted to the design of spoken dialog systems in general (as the name of the event might suggest), but also to more specific areas, such as automotive speech user interfaces. Besides the main program, the pleasant meal breaks provided a great environment for people to discuss research ideas informally.

YRRSDS lunch

I had some very interesting conversations about automotive topics during these breaks with Ben Reaves from Toyota ITC, Stefan Hamerich from Harman/Becker and Zeljko Medenica from Project54/UNH. We discussed the state of the art in automotive speech user interfaces and where the field is headed. Most of us had heard about Microsoft’s Sync technology for Ford vehicles, thanks to their advertising campaign in the USA, but other big automakers (e.g. Toyota and Honda) are also selling cars equipped with speech recognition systems. Ben proposed that now is the time to set standards for automotive speech user interfaces, which could be accepted by all relevant parties in business and research. We all agreed that the accepted standard might not be the best one (see VHS vs. Betamax), but it could still be very beneficial to the field by focusing its development.

The poster session was cleverly piggy-backed onto the afternoon coffee break. This way people didn’t even notice that their minds were “on-duty” even during “recess”. I presented a poster on my ongoing research concerning push-to-talk button solutions for in-car speech user interfaces. It drew a bigger crowd than I had imagined. Professor Alex Rudnicky from CMU asked about the premises and methods of my research. Then the automotive specialists, Stefan and Ben, were joined by several other participants in discussing details of the poster with me.

Oskar poster
Stefan Hamerich, Ben Reaves, Antoine Raux, Oskar Palinko, Milica Gasic

We had a very good conversation about the advantages and disadvantages of high-fidelity driving simulators in automotive research. They can provide lots of measured variables (lane position, steering wheel angle, distances, speeds, etc.), but at the same time researchers must cope with their possible undesired effects (e.g. motion sickness).

I found that informal discussions are a very effective way of sharing ideas within small groups of people. I really enjoyed talking to fellow researchers about common interests at YRRSDS’08.

Oskar Palinko
Project54, UNH

Education & Software Andrew Kun on 26 Jun 2008

Visualizing sorting algorithms

Check out this link for a great visualization of sorting algorithms by David Martin of Boston College.

Andrew Kun

Speech user interface Nemanja Memarovic on 24 Jun 2008

Experience with DELL’s spoken dialog system

Hello ecebloggers,

A week ago I bought a laptop on DELL’s web site. After experiencing some difficulties with their sales department, I decided to call them and see what was happening. Of course I ended up in their dialog system. I was asked to choose between “Home, Home office and business”. I said business and the system understood me. After answering a series of questions (“Are you calling about a recent purchase?” “Was the purchase made today?”…) I finally ended up with a simple “Yes or No” option. I had to repeat “No” several times before the system finally understood my “No!”. After all the trouble I went through I landed in technical support! Wow, I didn’t expect that at all. I have no idea what happened, because I answered all the questions but still didn’t end up where I wanted. Luckily the customer representative was very helpful and answered my questions because he understood me, unlike DELL’s dialog system.

Hope you don’t go through the same trouble as I did (if you decide to buy a DELL).

Have a good one,

Nemanja Memarovic

Science & Technology & Tips and tools Andrew Kun on 23 Jun 2008

Writing about science (and technology)

Check out this article from Cognitive Daily on how to report on scientific research to a general audience. Note that many of the suggestions are equally valid when they’re applied to documents aimed at scientific or technical audiences (e.g. “Explaining your figures is crucial…”).

Andrew Kun

Conferences & Speech user interface & US travel oszkar on 22 Jun 2008

Reporting from YRRSDS08

My colleague Zeljko Medenica and I are participating in this year’s Young Researchers’ Roundtable on Spoken Dialog Systems in Columbus, Ohio.

Columbus, Ohio

This is a very interesting event with lots of young speech researchers from all over the world. There are also representatives from research and development companies interested in spoken dialog systems, such as Microsoft Research, Nuance, VoiceObjects, AT&T, Toyota ITC, and Harman/Becker.

During YRRSDS we have participated in several roundtable discussions, which covered some very interesting topics: multimodal systems, the next killer apps, dialog system development, how to make spoken dialog systems human-like, etc.

roundtable

A more detailed description of the topics will be provided in future posts.

Zeljko and I also presented our posters which summarized our research interests. Participants were very interested in hearing about our experiments and results. This will also be discussed in future posts.

Columbus is a very nice city with a huge university campus, OSU (the biggest single campus in the USA), where the event took place. The organizers put a lot of effort into making this roundtable a very interesting one.

Oszkar Palinko and Zeljko Medenica

Web Andrew Kun on 22 Jun 2008

YouTube long-form

Check out Robert Scoble’s post on longer videos coming to YouTube. He also has a link to a Mark Cuban post in which Mark argues that YouTube’s business model is flawed and that Hulu’s is better. However, Hulu only has content generated by NBC and Universal (such as full-length movies), and no user-generated content (such as movies about Project54 research on YouTube), so it seems like we’re comparing apples and oranges. Perhaps a better comparison is between YouTube and Flickr, especially now that you can upload movies to Flickr. Both YouTube and Flickr are social networking sites, while Hulu is not.

Andrew Kun

Speech user interface & Ubicomp & User interface Andrew Kun on 18 Jun 2008

Tap input for mobile phones

I recently came across an interesting paper from Tangible and Embedded Interaction 2007, describing work at Nokia on gesture-based input for mobile phones.

One interesting aspect of the paper is the description of the first phase of the research, in which the authors showed videos to subjects online to explore which gestures are socially acceptable. This is a nice way to quickly reach your audience without incurring large expenses. A similar approach was used by Mark Taipan and Matt Lape, who created a video to help support their SURF grant proposal (it appears that it helped, since the proposal was funded).

I’m also very interested in the tap input idea. Oszkar Palinko and I will be presenting a paper at Intelligent Environments 2008 on the use of tap input for signaling to a speech recognizer the start of speech input.

Finally, a note on the conference presentation of this paper. Take a look at page 18. The authors inserted a picture showing a relevant slide from the keynote presentation of the conference from the day before. What a nice way to help explain the contribution of the paper in relation to the body of relevant knowledge! And of course, it’s always good to do this by quoting people who may actually be at your talk, such as the keynote speaker of your conference ;)

Andrew Kun
