Modern immersive virtual reality (IVR) often uses embodied controllers for interacting with virtual objects. However, it is not clear how these interactions should be conceptualised. They could be considered either gestures, as there is no interaction with a physical object, or actions, given that an object is manipulated, even if it is virtual. This distinction is important, as the literature has shown that, in the physical world, action-enabled and gesture-enabled learning produce distinct cognitive outcomes. This study investigates whether sensorimotor-embodied interactions with objects in IVR can cognitively be considered actions or gestures. It does this by comparing verb-learning outcomes between two conditions: (1) participants move the controllers without touching virtual objects (gesture condition); and (2) participants move the controllers and manipulate virtual objects (action condition). We found that (1) users can have cognitively distinct outcomes in IVR depending on whether the interactions are actions or gestures, with actions providing stronger memorisation outcomes; and (2) embodied controller actions in IVR behave more like physical-world actions in terms of verb memorisation benefits.