Category Archive: Talk
Education & Speech user interface & Talk puneet_IITguwahati on 30 Aug 2008
Speech User Interface Lecture - IIT Guwahati
Hi Ecebloggers,
This time, I decided to write something about the current activities at my college, IIT Guwahati. After completing my internship at UNH, I kept thinking about how I could promote the current research work at UNH and my pilot experiment work for obstacle testing during the internship period. Fortunately, the electronics department at our institute has initiated a lecture series on certain crucial areas of research in the electronics domain. This lecture series is being organized and managed by Cepstrum, the IIT Guwahati ECE society. I volunteered to contribute to this lecture series, the topic being “Speech Processing and In-vehicle Interaction”. Interestingly, many more students showed up for the lecture than expected. I started by defining speech synthesis, speech recognition, etc., and then picked up the pace to cover the in-vehicle speech user interface. For explaining an “in-vehicle speech user interface” in a limited time, nothing could serve better than the demo video of the Project54 speech user interface available on the Catlab website. I then went into some detail about my work at UNH and finally covered the future prospects and current research at the Microsoft Research Lab, US. I used the Microsoft Research driving simulator video to show the future directions in this field. Since some of the listeners were first-year and second-year undergraduates, I had to restrict myself to the basics of speech without going into the technical details. Here is a picture from the lecture and the rest can be found on this website.

Now, something about the listeners’ responses. Some of the second-year undergraduates wanted to know about specific fields like keyword spotting and speech recognition. Since my final-year B.Tech project at IIT Guwahati is on “Speech-based Emotion Recognition”, I was able to suggest some parameters that are important for emotion and speech recognition but again, without going into the finer technical details. Moving on, some juniors wanted to know how the lane variance and the steering wheel angle variance measures could actually be used to improve driving performance. To answer that query, I described the current research work being done in the Project54 Lab to improve driver performance. I remembered reading Oszkar’s paper on the wireless Push-to-Talk Glove and thus quoted it as an example, elaborating on how it is better than the fixed PTT switch and helps to improve driving performance. Finally, I concluded the lecture by providing the website address for eceblogger as a resource for information about the latest work in the Project54 Lab. To summarize, I had a wonderful time and I hope I sparked some enthusiasm in my junior undergraduates.
Puneet Lakhanpal
Education & People & Talk & Telematics & Ubicomp Andrew Kun on 23 May 2008
Ian Cassias defends MS thesis
On Tuesday, Ian Cassias defended his MS thesis. Ian worked in the field of telematics and he was interested in three topics: remote diagnostics of vehicles, vehicle fleet management and traffic monitoring.

My favorite part of Ian’s thesis is his work on traffic monitoring. Ian looked at how the police radar could be used to estimate traffic volume for a given segment of road and how fast the traffic is moving. In order to do this, Ian looked at the number of car velocity readings the radar reports, and the actual values reported. From these numbers he attempted to characterize road conditions along two axes: the slow-fast axis and the light traffic-heavy traffic axis. Ian’s pilot study shows that the police radar could very well be used to monitor traffic. If we can further develop this system we could make police cruisers into a set of roaming traffic probes. Data from the cruisers could be used for traffic prediction and, if wireless communication is available, for real-time traffic reports.
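To make the two axes a bit more concrete, here is a minimal sketch of how a batch of radar readings might be placed on them. This is only an illustration, not the method from Ian's thesis: the function name, the two thresholds and the sample readings are all assumptions made up for the example.

```python
from statistics import mean


def characterize_traffic(readings_mph, window_minutes,
                         slow_below_mph=40.0, heavy_above_per_min=10.0):
    """Place a road segment on the slow-fast and light-heavy axes.

    readings_mph: vehicle speeds reported by the radar during the window.
    Both thresholds are hypothetical placeholders, not values from the thesis.
    """
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    if not readings_mph:
        # No readings: no vehicles were seen, so speed cannot be judged.
        return {"speed": "unknown", "volume": "light"}
    rate = len(readings_mph) / window_minutes   # readings per minute
    avg_speed = mean(readings_mph)              # typical reported speed
    return {
        "speed": "slow" if avg_speed < slow_below_mph else "fast",
        "volume": "heavy" if rate > heavy_above_per_min else "light",
    }


# Many low readings in a short window versus a few high ones.
print(characterize_traffic([22, 18, 25, 20] * 6, window_minutes=2))
print(characterize_traffic([63, 58, 66], window_minutes=2))
```

A real system would presumably need to account for the cruiser's own motion and for repeated readings of the same vehicle, which this sketch ignores.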
Nice work Ian!
Andrew Kun
People & R&D & Talk Andrew Kun on 20 May 2008
Pavlo Melnyk PhD defense
Last week Pasha Melnyk defended his PhD dissertation, entitled “Biologically inspired composite image sensor for deep field target tracking.”

Pasha was interested in the problem of deep field tracking, or more specifically, he was interested in using image sensors to track objects from when they are very far from the observer all the way into the near field, when they are close to the observer. Pasha proposed a system in which multiple image sensors of different focal lengths create a composite image sensor that can achieve this type of tracking. However, he then ran into the problem of how to recognize and track objects in this new composite image. Will objects have different characteristics as they move through space and get picked up by different parts of the composite sensor? Pasha found an elegant solution to this problem. He described the composite image by nesting the log polar representations of individual cameras. One result is that objects do not significantly change shape as they are tracked by the multiple cameras.
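A general property of log polar coordinates helps explain why shape is preserved: magnifying an object about the image center only shifts it along the log-radius axis and leaves its angles unchanged. The sketch below demonstrates that property on a handful of points. It is not Pasha's nested representation; the function name, the sample shape and the zoom factor are assumptions chosen for illustration.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map an image point to (log radius, angle) around the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        raise ValueError("the center itself has no log polar coordinates")
    return math.log(r), math.atan2(dy, dx)


# Corners of a small square object, and the same object magnified 4x,
# as it might look through a longer lens or from a shorter distance.
shape = [(2.0, 1.0), (3.0, 1.0), (3.0, 2.0), (2.0, 2.0)]
zoom = 4.0

small = [to_log_polar(x, y) for x, y in shape]
large = [to_log_polar(zoom * x, zoom * y) for x, y in shape]

# Every corner moves by the same amount in log radius (log of the zoom
# factor) and not at all in angle, so the outline keeps its shape.
for (rho_s, theta_s), (rho_l, theta_l) in zip(small, large):
    print(round(rho_l - rho_s, 6), round(theta_l - theta_s, 6))
```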
Pasha successfully applied his idea to the problem of vehicle tracking. He was able to track vehicles from several hundred meters and then capture license plates as the vehicles drove by. The videos of this were really impressive.
Great job Pasha (and Rich Messner, Pasha’s advisor).
Andrew Kun
People & R&D & Science & Talk Andrew Kun on 18 May 2008
Ray Kurzweil on grand challenges for engineering
Ray Kurzweil is well known for his work on optical character recognition, text-to-speech synthesis and speech recognition. Recently, he was invited by the National Academy of Engineering to be part of a committee charged with outlining the grand challenges for engineering in the 21st century. Last Thursday, Ray gave a talk on this subject at the Broad Institute in Cambridge, MA, and four of my students and I went to hear him speak.

A good portion of the 2-hour talk was devoted to two ideas. One is that careful analysis of data can allow you to predict the growth of particular technologies. The second is that the growth of most, if not all, (successful) technologies is exponential. E.g., the number of Internet hosts has risen exponentially over the years (see slide below). Note that the slide tells us that this trend wasn’t perceptibly altered by the boom and bust of the dot-coms. Ray’s take on this: Wall Street didn’t realize that the Internet was growing exponentially, extrapolated the seemingly linear growth of the early exponential curve and came to the wrong conclusion that the Internet is not a viable place to make money.
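The gap between the two kinds of extrapolation is easy to see with a toy calculation. The numbers below are made up for illustration (a quantity that doubles every year, starting at 1) and are not taken from Ray's slides; the function names are likewise hypothetical.

```python
def exponential(year, start=1.0, doubling_years=1.0):
    """Quantity that doubles every `doubling_years` (hypothetical series)."""
    return start * 2 ** (year / doubling_years)


# Observe only the first three years of the curve.
early = [exponential(year) for year in range(3)]

# Draw a straight line through the first and last early observation.
slope = (early[-1] - early[0]) / (len(early) - 1)


def linear_guess(year):
    """Straight-line extrapolation from the early observations."""
    return early[0] + slope * year


for year in (2, 5, 10):
    print(year, linear_guess(year), exponential(year))
```

The two curves agree at year 2, but by year 10 the straight line predicts 16 while the doubling series has reached 1024.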

The rest of the talk was devoted to several technologies that in Ray’s opinion will make a difference in the next 20-25 years (note that he prefers to talk about this shorter time horizon rather than the next century). One set of technologies is related to biology. To frame this discussion, it’s worth taking in this quote:
“As remarkable as biology is, it’s full of downsides, and needs to be improved upon.”
That’s the spirit! So how will we do this? Well, in Ray’s opinion, our computational powers are rising fast enough that by 2029 we’ll have computers that will be able to pass the Turing test (btw, this will be OK, we’ll use them to extend our own abilities and the whole thing will not result in some takeover by the machines). Our abilities to simulate biological processes are already very powerful, and will only get more powerful, which in turn will allow us to engineer new cures rapidly. We’re in the process of developing miniature robots to be deployed in our bodies to fight disease and generally make us stronger (e.g. provide us with an extra boost of oxygen). And we also need to develop a way to quickly respond to any viral outbreak (man-made or natural).
Ray also predicted that ubicomp will play a central role in our future, since we’ll be “online all the time with augmented reality.” This was fun to hear for my students and me, since our Project54 research relies heavily on the ubicomp field. And, talking about students, the topic of education also came up. Ray pointed out that “passion, desire and skill to learn is what we can give our students.” Very well put.
Overall, the talk was inspirational. Now I want to read this book! Thanks Ray,
Andrew Kun
DSP & People & R&D & Talk Andrew Kun on 10 Apr 2008
Kevin Short lecture
Yesterday I attended the College of Engineering and Physical Sciences Frontiers Lecture by UNH Math Professor Kevin Short, entitled “Disassembly, Repair and Rebuilding of Music using Mathematics.” As the title suggests, Kevin discussed his work on restoring old music recordings, including his Grammy Award-winning work on restoring an old Woody Guthrie recording. Kevin won his Grammy in collaboration with Nora Guthrie, Jorge Mateus, Steve Rosenthal, Jamie Howarth and Warren Russel-Smith. Jamie Howarth has a special place in this group as the founder of Plangent Processes, a pioneering company in the field of music repair.

The talk was excellent. Kevin introduced several complicated digital signal processing ideas and made them accessible to us all. In this post I will concentrate on only one topic from the talk - Kevin’s work on the Woody Guthrie recording.
When I first heard about Kevin’s work on restoring old music recordings I also heard the term wire recording, but never followed up on what it really meant. Well, it turns out that wire recording is a technique that magnetizes a steel wire to record sound. This technology was used to create the Woody Guthrie recording and, as you can imagine, it’s a challenge to get a 50 year old coil of steel wire to reproduce sound with any reasonable quality.
One problem with a wire recording has to do with the uneven speed of motion of the wire under the recording head during recording (all recording techniques that use moving parts have this problem to some extent). The uneven speed of the recording medium stretches out some sound segments in time, and it compresses others. Another problem is that handling of the medium may damage that medium, and this in turn results in similar stretching and compression of the sound in time. An extreme case of this latter problem is a cassette player chewing up your tape. The tape may still be playable but the sound it produces isn’t that great any more.
The Woody Guthrie recording suffered from this type of stretching and compression. However, Kevin and Jamie Howarth were able to restore the recording by taking advantage of a particular type of noise in the recording: powerline hum. You see, the wire recorder was plugged into a wall outlet, which provided AC current at a frequency of 60 Hz, and this frequency was pretty constant. The recording was contaminated with a 60 Hz sinusoidal noise that originated from the powerline. However, when you play back the wire recording, the frequency of this noise sinusoid fluctuates. Kevin and Jamie realized that they had a known source in the 60 Hz powerline and that, by observing the fluctuations of the frequency of the noise sinusoid on the wire, they could understand how all the other sounds were distorted as well. This knowledge can then be used to reverse the distortions (compress stretched out parts of the recording, and stretch out compressed ones). What an elegant idea!
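The core of the idea can be sketched in a few lines. This is a simplified illustration and not Kevin and Jamie's actual algorithm: it assumes the hum frequency has already been measured for each short block of the playback, and the function names, block length and sample values are all hypothetical. If the hum in a block reads 66 Hz instead of 60 Hz, that block is playing 10% too fast, so one second of playback holds 1.1 seconds of the original performance.

```python
from bisect import bisect_right

NOMINAL_HUM_HZ = 60.0  # powerline frequency at the time of recording


def original_timeline(hum_hz_per_block, block_seconds):
    """Original time elapsed at each block boundary of the playback."""
    timeline = [0.0]
    for hum_hz in hum_hz_per_block:
        if hum_hz <= 0:
            raise ValueError("hum frequency must be positive")
        speed = hum_hz / NOMINAL_HUM_HZ  # > 1: block plays too fast
        timeline.append(timeline[-1] + block_seconds * speed)
    return timeline


def playback_time_for(original_t, timeline, block_seconds):
    """Find where in the playback a given original instant sits.

    Inverts the timeline, interpolating linearly within a block.
    """
    if not 0.0 <= original_t <= timeline[-1]:
        raise ValueError("original_t is outside the recording")
    # Index of the block containing original_t (clamped to the last block).
    i = min(bisect_right(timeline, original_t) - 1, len(timeline) - 2)
    frac = (original_t - timeline[i]) / (timeline[i + 1] - timeline[i])
    return (i + frac) * block_seconds


# Hypothetical hum measurements for four one-second playback blocks.
hum = [60.0, 66.0, 54.0, 60.0]
timeline = original_timeline(hum, block_seconds=1.0)
print(timeline)
print(playback_time_for(1.55, timeline, block_seconds=1.0))
```

To restore the audio, one would step through the original timeline at a uniform rate, look up the matching playback time for each step, and read (interpolate) the playback signal there. That resampling step is left out of the sketch.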
Thanks for the excellent lecture, Kevin, and thanks to the CEPS Dean, Joe Klewicki, for sponsoring the lecture series. For more pictures from the event click here.
Andrew Kun
People & Speech user interface & Talk & User interface Andrew Kun on 26 Mar 2008
Susan Boyce visit to Project54 lab
Susan Boyce of Tellme visited the Project54 lab and gave a talk entitled “Designing for voice search.” The talk touched upon several Tellme projects, including 1-800-CALL-411, a service that allows callers to search for businesses, buy movie tickets, etc. Impressively, 75% of incoming calls to this number are handled completely by the Tellme speech software, without any human help. This is of great importance to Tellme, because clients who deploy directory assistance (or similar) voice systems pay only for calls that are fully handled and not handed off to a human operator.

Susan pointed out that companies such as Tellme (a Microsoft subsidiary), Yahoo and Google compete heavily in the mobile world of voice search. They all believe that, as mobile phones get more powerful and data plans become cheaper, mobile phones will take over from laptops as the primary gadget that’ll let you get driving directions, buy tickets, order pizza, etc. Actually, since Domino’s is a Tellme client, you can already download an application for some cellphones that’ll let you order pizza with the click of a button (or a voice command). You can even order the “usual” - not bad!
Susan also talked about the multimodal nature of mobile phones: user interface designers can take advantage of the phone’s display for information output while allowing you to talk to the phone for information input. Not having to rely exclusively on speech for information output is nice in many applications, for example when you’re asked to select an item from a list (e.g. a business from a list of close matches to your query).
After the talk Susan had a chance to check out our driving simulator and continue the conversation about speech interfaces with several of my students.

Thanks for visiting Susan and for giving a great talk!
Andrew Kun
Driving simulator & Education & PDA & People & Project54 & Talk & Technology & UNH ECE Erika Clifford on 21 Mar 2008
Exploring High Tech Day
Project54 participated in UNH’s Exploring High Technology Day today. We had 3 groups of 23-ish high school visitors come to the lab for a demonstration of the project. We had students from Londonderry, Bow, Newmarket and Portsmouth, as well as some home-schooled students. Andrew Kun, Project Director, introduced the project. Nate Purmort, Engineering Manager, explained the nuts and bolts of the project and gave the students a hands-on demonstration of the system’s push-to-talk technology. Andras Fekete, graduate student, demonstrated the use of the handhelds and how they will be used in the field once deployed. Undergraduate student Mark Taipan highlighted the value of working in a program where he can apply what he learns and get paid! He also touched on his research project involving P54 and video cameras. Grad student Oskar Palinko set up and demonstrated our simulator’s newest feature, the eye tracker. Alex Shyrokov and Zeljko Medenica, also grad students, presented the simulator, giving the students a variety of driving experiences including a snow storm. You would think we have had enough of the snow already! The demonstrations went well and we had 3 great groups of students. The students were involved and interested, and they had some thoughtful questions. Students trying out the simulator usually perceive it as a giant video game, so it is always nice when they recognize how it fits in or, as in this case, ask the question, “how does this (the simulator) fit in with the project?”

Alex demonstrating the simulator and its value for researching distracted driving.

Oskar explaining the EyeTracker and its use for researching the visual habits of people while they are behind the wheel.
All in all it was a great day and we were thrilled to be part of it. We look forward to seeing some of these students in the future as UNH undergrads!
~Erika
People & R&D & Talk Andrew Kun on 07 Mar 2008
Paul Picciano talk
Paul Picciano of Aptima gave a talk to Project54 students, staff and faculty. Paul introduced his company Aptima, and talked about several projects Aptima has undertaken.

True to Aptima’s “human centered engineering” tagline, the projects Paul described all revolve around improving human performance with the help of technology. One such project is aimed at creating tools to evaluate how much automation is too much when designing airplanes. As it turns out, automating the operation of an airplane beyond a certain point reduces a pilot’s ability to safely operate the plane. Where is this point, or more accurately, what are the relevant dimensions and levels of automation and how do they influence pilot performance? Aptima is looking for the answer.
Aptima also works on many projects related to national security. This may be less than great news for international students interested in working for this company, since they may not be able to attain the required level of security clearance. Nevertheless, the projects themselves are very interesting. E.g. one project tackles the problem of operating multiple unmanned aerial vehicles (UAVs). The operator can set various parameters as he/she plans missions for the UAVs and receive intuitive visual feedback about UAV paths, enemy positions, etc.
The sense I got from Paul’s talk is that Aptima is an exciting place to work. The projects are varied, they are of obvious importance to our society and the solutions to the problems they pose often require research.
I really enjoyed the talk, and judging by their questions and comments so did the students. Thanks for visiting Paul!
Andrew Kun
