Navigation

Mobile App Video Demonstration of Navigation and Action Recognition

Our smart, affordable mobile app helps you navigate conveniently and understand your surroundings.

Convenient:

  • Speech recognition for giving navigation instructions

  • Returned directions are read aloud

Functions:

  • Navigation

  • Action recognition

  • Object recognition

  • Scene text recognition

  • Currency recognition

Advantages

1. Higher Accuracy

Verified on multiple datasets that the HAMT vision-language navigation (VLN) algorithm achieves higher navigation accuracy than other VLN methods.

2. Action Recognition

Optimized HAMT by incorporating a high-accuracy action recognition module to achieve both navigation and action recognition.

3. Assistance

Created an assistance system that can further help visually impaired people navigate and understand their environments.

My 3D Model of Shanghai American School's Library

To apply AAVLN in the real world, I will create more 3D models of environments for further reinforcement learning training.

From conversations with visually impaired people, I learned that they most want to navigate the following environments:

  • Shopping Malls

  • Hospitals

  • Metro Stations

  • Cafeterias

Navigation Difficulty

Visually impaired people often find it hard to navigate different environments because the scene text around them is frequently curved or distorted and therefore hard to recognize. Existing OCR systems are highly accurate on neatly laid-out text but less accurate on scene text, which is the main problem for visually impaired people.

Expanding Application

Robots

Challenge: Robots need to understand their environment and navigate to destinations to complete tasks.

Action recognition and vision-language navigation can direct robots to their destinations to complete tasks such as washing dishes or grabbing utensils (Kazakos et al.).

VR/AR

AR challenge: AR systems cannot fully understand the real environment and integrate it into the virtual world.

Action recognition can help AR systems understand real environments so that they can merge the real and virtual worlds.
