Designed the 3D point cloud annotation and labeling experience for global contributors on Hive Data's data labeling platform. The primary motivation of this design is to generate annotation data for LiDAR models, a key use case in self-driving.
What is LiDAR?
LiDAR is a 3D scanning technology that generates a point cloud: a set of points in space that represent physical surfaces in the scene. The scene may also consist of multiple frames that change over time, like a video. LiDAR neural models are built on large 3D point cloud datasets, which are the foundation of reliable self-driving technology.
The Challenge
How might we collect large 3D point cloud datasets from driving scenes for LiDAR model training?
The Solution
A web app for individual contributors who can earn easy cash rewards by correctly marking and labeling objects in 3D driving environments, producing training data for LiDAR models
The Design Process
No inspirations? Secondary User Research
Since we couldn't find any industry examples of something similar at the time, we ran secondary research on the closest analogue to 3D point cloud annotation: 3D design software, where users draw 3D objects and work with layers (a close parallel to our labeling experience). Based on the research, I came up with the following user persona to guide my upcoming feature brainstorming sprint.
Finding the success metrics: Converging Business & User Goals
**User goals collected from survey responses from 15 Hive Micro workers who are top users of the platform
Sketching concepts based on initial assumptions and goals
After creating user personas and empathy mapping for 3D labelers, I started sketching out some initial screens to conceptualize layouts and task flows.
Some initial UX challenges faced during wireframing:
Relatively new cognitive schemas: Data labeling in 3D space is a new concept with a steep learning curve, since a new user has to be mindful of the x, y, and z axes
Navigation complexity in 360-degree space: Too many angles and directions for a user to keep track of and move in without making a mistake
Error proneness: Multiple layers in an elevated, rotatable space cause not only cognitive overload but also a loss of control over the scene, and users become more error prone because each frame is only a slightly modified, realigned version of the previous one
Screen fatigue: How might we sustain user focus without burnout while they complete a repetitive job with many frames to organize?
Large datasets to label: Users label, on average, at least 10-15 frames at a time, making it easy to lose track of objects in the scene
Features & UI addressing success metrics and user feedback
Camera navigation: On-screen controls and shortcut keys for position (x, y, elevation), angle (x, y, z), and 360° rotation, modeled on video game joystick schemas for seamless interaction
Color customization and coding: Labels in the scene are color-coded to match their layers in the sidebar, making an object easy to recognize among other objects in the same 3D space
2D build mode: Helps workers draw precise cuboids for accurate job completion and, in turn, accurate datasets
Copy object layer: Duplicate a layer from the previous frame to reduce task completion time
Bulk actions: Delete a labeling error discovered later from all frames at once for faster task completion
Tags: Visibility of system status; the tags in each frame match the cuboid color of the object they represent in the scene, primarily a way to error-check at a glance by strengthening the association
Filters: Quickly locate and edit a label and recover from errors while working with large sets of similar data (a data-model sketch of how objects, cuboids, and frames relate follows this list)
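To make these features concrete, here is a minimal sketch of how the underlying annotation data could be modeled: a stable object ID and color shared across frames, with one cuboid per frame the object appears in. It is a TypeScript illustration with hypothetical names, not the platform's actual schema.

```typescript
// Hypothetical data model for a 3D point cloud labeling job (illustrative names only).
interface Cuboid {
  center: { x: number; y: number; z: number };              // position in the scene
  size: { width: number; length: number; height: number };
  yaw: number;                                               // rotation around the vertical axis, in radians
}

interface LabeledObject {
  objectId: string;   // stable ID reused across frames
  category: string;   // e.g. "car", "pedestrian", "cyclist"
  color: string;      // hex color shared by the cuboid and its sidebar tag
}

interface FrameAnnotation {
  frameIndex: number;
  cuboids: Map<string, Cuboid>;   // one cuboid per visible object, keyed by objectId
}

interface LabelingJob {
  objects: Map<string, LabeledObject>;
  frames: FrameAnnotation[];
}
```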
User flows for the end-to-end experience
After identifying and designing desirable features that also align with business goals, we prototyped an exhaustive list of flows that would make up the MVP's key features.
Task 1: Label a 3D object encountered for the first time in the task. Initial sketches: Use the draw tool to enter build mode, place a cuboid, position and size it in the environment, then generate an ID by labeling it (a code sketch of this flow follows below)
UI decisions: Using colors as tags for filtering helps organize the environment and creates a freer, more engaging space for the user; shading helped users navigate the 3D space faster, and changing opacity across states of the flow removed the cognitive barriers that come from too much color
Empathy mapping: Applying schema theory, the idea that our brains organize knowledge into meaningful units. By separating build mode from general mode, users are immediately nudged to take a particular action and stay in that state until they deliberately prompt themselves out of it
Success metrics: retention rate on a task, completion rate on a task
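A sketch of how Task 1 could be wired up under the hypothetical data model above: entering build mode yields a cuboid, and labeling it generates the object's ID the first time it appears. The helper name and ID generation are assumptions for illustration.

```typescript
// Hypothetical helper for Task 1: place a cuboid and label it for the first time,
// generating the object ID that later frames will reuse.
function labelNewObject(
  job: LabelingJob,
  frameIndex: number,
  cuboid: Cuboid,
  category: string,
  color: string
): string {
  const objectId = crypto.randomUUID();                   // new stable ID for this object
  job.objects.set(objectId, { objectId, category, color });
  job.frames[frameIndex].cuboids.set(objectId, cuboid);   // attach the cuboid to this frame
  return objectId;
}
```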
Task 2: Find a label and modify it. Initial sketches: Filter placeholders (a filter sketch follows below)
UI decisions: Icons identify labels by category, and colors let the worker change a label's color so it is easy to locate in all subsequent frames
Organized space: Easy label changes reduce screen fatigue and the frustration of finding objects in a large, evolving space
Success metrics: engagement rate on a task, success rate on a task
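A sketch of the filter behavior under the same hypothetical model: given a category, return the frames in which a matching object appears so the worker can jump straight to it and edit the label.

```typescript
// Hypothetical helper for Task 2: find every frame containing an object of a category,
// so the worker can locate and modify its label without scrubbing through frames.
function framesContainingCategory(job: LabelingJob, category: string): number[] {
  const matchingIds = [...job.objects.values()]
    .filter((obj) => obj.category === category)
    .map((obj) => obj.objectId);
  return job.frames
    .filter((frame) => matchingIds.some((id) => frame.cuboids.has(id)))
    .map((frame) => frame.frameIndex);
}
```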
Task 3: How can we make it easy for a user to copy already generated IDs from previous frames? User testing feedback: Users found it taxing to go back and forth copying objects from one layer to another (a copy-from-previous-frame sketch follows below)
UI decisions: A window to place, review, and edit objects carried over from the previous frame
Frictionless labeling: Faster labeling by reusing objects from the previous frame instead of recreating them
Success metrics: engagement rate on a task, success rate on a task
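A sketch of the copy-object-layer action under the hypothetical model above: seed the current frame with the previous frame's cuboids so the worker only realigns them instead of redrawing and relabeling.

```typescript
// Hypothetical helper for Task 3: carry objects over from the previous frame.
function copyFromPreviousFrame(job: LabelingJob, frameIndex: number): void {
  if (frameIndex === 0) return;                            // nothing to copy into the first frame
  const previous = job.frames[frameIndex - 1];
  const current = job.frames[frameIndex];
  for (const [objectId, cuboid] of previous.cuboids) {
    if (!current.cuboids.has(objectId)) {
      current.cuboids.set(objectId, structuredClone(cuboid)); // deep copy; the worker realigns it
    }
  }
}
```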
Task 4: Remove an ID. Initial sketches: Delete button for an object (a bulk-delete sketch follows below)
UI Decisions: Adding a confirmation screen to reduce user errors
User testing feedback: Most jobs repeat objects across multiple frames, so workers want to delete an object from frames where they later discover it does not exist
User error prevention and recovery: Designing bulk actions for repetitive work, since each frame is only a modified version of the others, helps reduce user exhaustion and sustain energy on the task
Success metrics: completion rate on a task, success rate on a task
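A sketch of the bulk-delete action under the same hypothetical model: removing a mistakenly created object clears its cuboid from every frame in one step instead of frame by frame.

```typescript
// Hypothetical helper for the bulk-delete action: remove an object and its cuboids
// from all frames at once.
function deleteObjectEverywhere(job: LabelingJob, objectId: string): void {
  job.objects.delete(objectId);        // remove the label itself
  for (const frame of job.frames) {
    frame.cuboids.delete(objectId);    // remove its cuboid wherever it appears
  }
}
```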
Next steps & Dev Kickoff
User Testing for next release: QA testing with our India team with specific task flows and interview calls. Metrics used to measure success were learnability and desirability over other jobs with similar earnings.
Brainstorming nice-to-haves: A semantic segmentation screen mode and expanding the user's scope of action
Prepare dev handoff: UI styleguide and component library
Key Takeaways & Learnings
While working with new concepts, shorter feedback loops help: Data labeling in 3D space is a relatively new concept to build, one that does not yet exist in users' schemas and requires a learning curve similar to how designers develop a basic understanding of design software. A key learning was that initial assumptions form not only about user experiences but also about basic task flows when working with many elements; testing at each step helps confirm or reject those assumptions
Cross-functionality while working with multiple teams: Collaborating with other designers, marketing, engineering, and the machine learning team, and aligning on feasibility while providing a thoughtful experience, became key
The concept of spatial memory: With repeated practice, users developed an imprecise memory of objects and annotations in the interface, but still needed additional visual signifiers to help them find a specific object within a group
The Zeigarnik effect: By designing build mode and converging on the concept of separate environments for different actions, we leveraged the Zeigarnik effect: when a user starts working on a frame by entering build mode but is too fatigued to finish it, thoughts of the unfinished labeling keep popping into their mind even after they've moved on to other things. Those thoughts urge the user to stay and finish the task now that they're already in the zone