We consider the problem of tracking physical browsing by users in indoor spaces such as retail stores. Analogous to online browsing, where users choose to visit certain webpages, dwell on a subset of pages of interest to them, and click on links of interest while ignoring others, we can draw parallels in the physical setting, where a user might walk purposefully to a section of interest, dwell there for a while, gaze at specific items, and reach out for the ones that they wish to examine more closely. As our first contribution, we design techniques to track each of these elements of physical browsing using a combination of first-person vision enabled by smart glasses and inertial sensing using both the glasses and a smartphone. We address key challenges, including energy efficiency, by using the less expensive inertial sensors to trigger the more expensive vision processing. Second, during gazing, we present a method for identifying the item(s) within view that the user is likely to focus on, based on measuring the orientation of the user's head. Finally, unlike in the online context, where every webpage is just a click away, proximity is important in the physical browsing setting. To enable the tracking of nearby items, even those outside the field of view, we use data gathered from smart-glasses-enabled users to infer the product layout using a novel technique called AutoLayout. Further, we show how such inferences made from a small population of smart-glasses-enabled users could aid in tracking the physical browsing of the many smartphone-only users.
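The energy-efficiency idea above, using cheap inertial sensing to gate expensive vision processing, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the variance threshold, window size, and function names are assumptions chosen for clarity.

```python
import math

# Illustrative constants (assumed values, not from the paper):
STILLNESS_THRESHOLD = 0.05  # variance of accel magnitude below which the user is "dwelling"
WINDOW = 50                 # accelerometer samples per decision window

def accel_variance(samples):
    """Variance of acceleration magnitude over a window of (x, y, z) samples."""
    mags = [math.sqrt(x * x + y * y + z * z) for (x, y, z) in samples]
    mean = sum(mags) / len(mags)
    return sum((m - mean) ** 2 for m in mags) / len(mags)

def should_run_vision(samples):
    """Gate the expensive vision pipeline on cheap inertial sensing:
    trigger it only when inertial motion is low (the user appears to
    dwell), keeping the camera pipeline off while the user is walking."""
    return accel_variance(samples) < STILLNESS_THRESHOLD

# A stationary window (constant gravity vector) triggers vision;
# a window with large motion does not.
still_window = [(0.0, 0.0, 1.0)] * WINDOW
moving_window = [(0.0, 0.0, 1.0), (0.5, 0.5, 1.5)] * (WINDOW // 2)
```

In practice, such a trigger would run continuously at the inertial sensor's low duty cycle, waking the camera and vision stack only around candidate dwell events.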