Object recognition using depth sensors such as the Kinect has received considerable attention in recent years. Yet the limitations of these devices, notably high noise levels and missing data, make the problem very challenging. In this work I propose a framework for data-driven object recognition that combines local and global features with time-varying depth information.