Human–Robot Collaboration in Assembly Using Event-Based Pose Estimation

My name is Alf Stian, and I am a master’s student in mechatronics at the University of Agder. Through the “Student in Research and Innovation” programme, I am working on a project where a robot and a human share the same workspace during assembly tasks. The long-term goal is for the robot to understand how the human is moving and to react in real time, for example by handing over the right tool at the right moment.

To do this safely, the robot needs a reliable estimate of the human’s body position, often called human pose estimation. Traditional RGB cameras record normal video, but they are not ideal for this setting. They struggle in very bright or dark industrial environments, they produce large amounts of data that are slow to process, and fast hand movements often appear as motion blur. For a system that must react in real time, these limitations are a real problem.

Event cameras work very differently. Instead of sending full images at a fixed frame rate, each pixel independently reports a small “event” only when the brightness it observes changes. This makes them extremely fast, with very low latency and no motion blur. They also handle difficult lighting conditions much better and generate far less data. That makes them interesting for robots and other systems that run on limited computing power.
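To give a feel for what this data looks like, here is a minimal Python sketch, assuming events arrive as (x, y, timestamp, polarity) tuples, which is the typical raw output of an event camera. The resolution, time window and function names are just illustrative: the sketch accumulates a short window of events into a 2D frame that a model (or a human) could look at.

```python
import numpy as np

# Each event is (x, y, timestamp_us, polarity): polarity is +1 for a
# brightness increase and -1 for a decrease. Resolution is illustrative.
WIDTH, HEIGHT = 640, 480

def accumulate_events(events, window_us=10_000, t_start=0):
    """Accumulate events from a short time window into a 2D frame."""
    frame = np.zeros((HEIGHT, WIDTH), dtype=np.int16)
    for x, y, t, polarity in events:
        if t_start <= t < t_start + window_us:
            frame[y, x] += polarity
    return frame

# Example: three synthetic events inside a 10 ms window.
events = [(100, 120, 1_000, +1), (101, 120, 1_500, +1), (300, 240, 9_000, -1)]
frame = accumulate_events(events)
print(frame[120, 100], frame[240, 300])  # -> 1 -1
```

Because only changing pixels produce events, most of the frame stays empty, which is exactly why the data volume is so much lower than with normal video.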

In our project, we combine an event camera and a normal RGB camera. The event camera is used to estimate the human’s pose and understand how the person is moving during an assembly task. The RGB camera is used in parallel to detect objects and tools in the workspace. Together, these sensors will give the robot enough information to support the human in a safe and predictable way.
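As a rough illustration of how two such streams can be paired, the sketch below uses the approximate time synchroniser from the ROS2 message_filters package. The topic names are placeholders, and the event stream is assumed to already be accumulated into image-like frames; this is a sketch of the idea rather than the final pipeline.

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from message_filters import Subscriber, ApproximateTimeSynchronizer

class SensorFusionNode(Node):
    """Pairs RGB images and event frames whose timestamps are close
    enough to be treated as simultaneous."""

    def __init__(self):
        super().__init__('sensor_fusion_node')
        # Topic names are placeholders for this sketch.
        rgb_sub = Subscriber(self, Image, '/rgb_camera/image_raw')
        event_sub = Subscriber(self, Image, '/event_camera/frame')
        # Accept pairs whose stamps differ by at most 20 ms.
        self.sync = ApproximateTimeSynchronizer([rgb_sub, event_sub],
                                                queue_size=10, slop=0.02)
        self.sync.registerCallback(self.on_pair)

    def on_pair(self, rgb_msg, event_msg):
        # The RGB image would go to object detection and the event
        # frame to the pose-estimation model.
        self.get_logger().info('Got a synchronised RGB/event pair')

def main():
    rclpy.init()
    rclpy.spin(SensorFusionNode())
    rclpy.shutdown()

if __name__ == '__main__':
    main()
```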

My main contribution so far has been on the software and infrastructure side. I have been working on a digital twin of a UR5 robot using ROS2, Gazebo and RViz, so we can test everything safely in simulation before moving to the real robot. I have also worked on extracting data from the cameras for machine-learning-based pose estimation and on connecting the cameras to the robot through ROS2 and Python, and on making sure the sensor data can be streamed, synchronised and visualised in real time.
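As an example of the kind of glue code this involves, the sketch below publishes a set of body keypoints as RViz markers from a small rclpy node. The keypoint coordinates, topic name and frame name are placeholders; in the real setup the coordinates would come from the pose-estimation model and appear next to the simulated UR5.

```python
import rclpy
from rclpy.node import Node
from visualization_msgs.msg import Marker, MarkerArray

class KeypointVisualizer(Node):
    """Publishes estimated body keypoints as RViz sphere markers."""

    def __init__(self):
        super().__init__('keypoint_visualizer')
        self.pub = self.create_publisher(MarkerArray, 'human_keypoints', 10)
        self.timer = self.create_timer(0.05, self.publish_keypoints)  # 20 Hz

    def publish_keypoints(self):
        # Placeholder keypoints (x, y, z) in metres; in the real pipeline
        # these come from the event-based pose-estimation model.
        keypoints = [(0.5, 0.0, 1.5), (0.5, 0.2, 1.2), (0.5, -0.2, 1.2)]
        markers = MarkerArray()
        for i, (x, y, z) in enumerate(keypoints):
            m = Marker()
            m.header.frame_id = 'world'
            m.header.stamp = self.get_clock().now().to_msg()
            m.id = i
            m.type = Marker.SPHERE
            m.action = Marker.ADD
            m.pose.position.x, m.pose.position.y, m.pose.position.z = x, y, z
            m.scale.x = m.scale.y = m.scale.z = 0.05
            m.color.r, m.color.g, m.color.a = 0.1, 0.9, 1.0
            markers.markers.append(m)
        self.pub.publish(markers)

def main():
    rclpy.init()
    rclpy.spin(KeypointVisualizer())
    rclpy.shutdown()

if __name__ == '__main__':
    main()
```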

For me, the most interesting part of this project is seeing how hardware, software and artificial intelligence come together in a concrete application. I have learned a lot about ROS2, robot simulation and sensor integration, and I have gained a better understanding of what is needed to make human–robot collaboration both safe and useful. The next steps will be to integrate event-based pose estimation models into this setup and move gradually from simulation to experiments with the real robot.