Robotic Arm with Computer Vision

Robotic Arm with Computer Vision - Picking Up the Object


Idea

The main idea behind this project was to create an environment where a robotic arm can execute various commands based on image analysis of a scene. In this article, I will describe each part of the project in detail. For the first task, I focused on detecting and moving a single object.



Environment

The setup consists of several components assembled together. I used an old table as the base and repainted it white to provide better contrast with the objects. The robotic arm, which I purchased from eBay, is mounted at the middle of one of the table's longer sides. The arm has six servo motors, including a rotating base at one end and claws at the other. The parts are made of aluminum and are quite sturdy.

Next, I cut and mounted perforated metal ledges to the corners of the table, securing everything together. I then attached an RGB LED strip to the bottom side of the top part of the structure. Finally, I placed a USB camera at the top of the setup to capture the entire scene.


Communication with Arm

The quickest way to control the arm's six servo motors is through a servo controller, which can drive servos individually or in groups. I chose a controller with a serial interface and a custom text protocol, so commands can be sent over USB with just a few lines of code in any programming language.

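Example of a group operation:
#1P1500#2P2300#3P2300#4P2300#5P2300#6P2300T100\r\n
General format, where each pulse width sets one servo's rotation (values such as 1500 correspond to microseconds for typical hobby servos):
#<servo-index>P<pulse-width>#<servo-index>P<pulse-width>...T<time-for-execution>\r\n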
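Building such a command string takes only a few lines. The following is a minimal sketch rather than the project's actual code: the names ServoTarget and buildGroupCommand are hypothetical, and the pulse width is assumed to be in microseconds.

```cpp
#include <string>
#include <vector>

// One servo target: channel index and pulse width.
// Assumption: the pulse width is in microseconds (1500 is roughly the
// center position of a typical hobby servo).
struct ServoTarget {
    int index;
    int pulseWidth;
};

// Builds a group command such as "#1P1500#2P2300T100\r\n".
// `executionTime` is passed through unchanged as the T parameter.
std::string buildGroupCommand(const std::vector<ServoTarget>& targets,
                              int executionTime) {
    std::string command;
    for (const ServoTarget& target : targets) {
        command += "#" + std::to_string(target.index) +
                   "P" + std::to_string(target.pulseWidth);
    }
    command += "T" + std::to_string(executionTime) + "\r\n";
    return command;
}
```

Calling buildGroupCommand with the targets {1, 1500} and {2, 2300} and an execution time of 100 yields a string in the same shape as the example above, ready to be written to the serial port.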


Logic flow

The application consists of several independent components that communicate with each other.

  • The camera input is handled in a separate thread, which preprocesses each "interesting frame" (a frame captured after movement was detected). The result of this preprocessing is a list of detected objects with their coordinates.
  • The interesting frame is then sent to the main logic module. Here, all the other modules are registered. If no module is active, the system tries to initialize the first module that satisfies the initial conditions. If a module is already active, the interesting frame is sent to that module. The module handles the logic and decides what to do next. It then sends movement commands to a queue for the USB communicator.
  • The USB communicator repeatedly reads messages from the queue and sends commands to the controller via USB. The controller then moves the servo motors accordingly.
Figure: schema of the logic flow
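The hand-off between the logic module and the USB communicator can be sketched as a thread-safe queue. This is a minimal sketch under assumptions, not the code from the repository: CommandQueue and runCommunicator are hypothetical names, and the send callback stands in for the real serial write.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Thread-safe queue of command strings (hypothetical name).
class CommandQueue {
public:
    // Called by the active module to enqueue a movement command.
    void push(std::string command) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            commands_.push(std::move(command));
        }
        ready_.notify_one();
    }

    // Called by the USB communicator thread. Blocks until a command is
    // available. Returns false once close() was called and the queue
    // has been drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !commands_.empty(); });
        if (commands_.empty()) return false;
        out = std::move(commands_.front());
        commands_.pop();
        return true;
    }

    // Wakes up the consumer so its loop can end at shutdown.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> commands_;
    bool closed_ = false;
};

// Consumer loop for the USB communicator thread. `send` is a placeholder
// for whatever writes the bytes to the serial port.
template <typename SendFn>
void runCommunicator(CommandQueue& queue, SendFn send) {
    std::string command;
    while (queue.pop(command)) {
        send(command);
    }
}
```

In a real program runCommunicator would run in its own std::thread, so the module logic never waits on the serial port. Commands already queued before close() are still delivered in order.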

Calculation of the Next Move

One of the most frequently used features will be picking up objects. Once the camera provides preprocessed input, the next step is to calculate the movements required to pick up the object. At this point, we have a frame with the detected object and the center of the arm. We also know the real dimensions of the table, the length of the arm’s segments, and the base height. The task is to calculate the angles for each servo in the arm so it can reach and pick up the object. We can break this problem down into two smaller geometric problems.

  • The first part involves base rotation (imagine viewing the arm from above). This is a trigonometry exercise: we know all three points of a triangle, and therefore its side lengths, and we need to calculate one angle. We then convert this angle to the pulse width that the servo controller expects.

  • The second part involves rotating three servos to tilt the arm (imagine viewing the arm from the side). In the initial version, the height of the object is unknown, so we use a constant. This problem is similar to the previous one, but here each segment contributes one known side (its length) and one unknown angle. By substituting candidate angles, we can check whether the tip of the arm reaches the object. I used brute force to calculate the three angles, and the process took less than a second.
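Both geometric problems can be sketched in a few functions. This is only an illustration under stated assumptions, not the project's implementation: all names are hypothetical, coordinates are assumed to be already converted from pixels to table units, the base angle uses atan2 instead of an explicit triangle formula, the servo mapping (0° at 500 µs, 180° at 2500 µs) is an uncalibrated guess, and the joint ranges in the brute-force search are arbitrary.

```cpp
#include <cmath>
#include <limits>

const double kPi = 3.14159265358979323846;
const double kDegToRad = kPi / 180.0;

// Top view: angle of the object as seen from the arm base, in degrees,
// measured counter-clockwise from the table's x axis.
double baseAngleDeg(double baseX, double baseY, double objX, double objY) {
    return std::atan2(objY - baseY, objX - baseX) / kDegToRad;
}

// Assumed linear servo mapping: 0 deg -> 500 us, 180 deg -> 2500 us.
// A real servo needs its own calibrated end points.
int angleToPulseWidth(double angleDeg) {
    return static_cast<int>(std::lround(500.0 + angleDeg / 180.0 * 2000.0));
}

struct ArmPose {
    int shoulderDeg;  // first segment, relative to the horizontal
    int elbowDeg;     // second segment, relative to the first
    int wristDeg;     // third segment, relative to the second
    double error;     // remaining distance between arm tip and target
};

// Side view: tries every whole-degree combination of the three joints and
// keeps the one whose tip lands closest to (reach, targetHeight).
ArmPose bruteForcePose(double l1, double l2, double l3, double baseHeight,
                       double reach, double targetHeight) {
    ArmPose best = {0, 0, 0, std::numeric_limits<double>::max()};
    for (int a1 = 0; a1 <= 180; ++a1) {
        const double t1 = a1 * kDegToRad;
        const double x1 = l1 * std::cos(t1);
        const double z1 = baseHeight + l1 * std::sin(t1);
        for (int a2 = -120; a2 <= 120; ++a2) {
            const double t2 = t1 + a2 * kDegToRad;
            const double x2 = x1 + l2 * std::cos(t2);
            const double z2 = z1 + l2 * std::sin(t2);
            for (int a3 = -120; a3 <= 120; ++a3) {
                const double t3 = t2 + a3 * kDegToRad;
                const double x3 = x2 + l3 * std::cos(t3);
                const double z3 = z2 + l3 * std::sin(t3);
                const double error = std::hypot(x3 - reach, z3 - targetHeight);
                if (error < best.error) {
                    best = {a1, a2, a3, error};
                }
            }
        }
    }
    return best;
}
```

Here reach is the horizontal distance between the base and the object, which follows from the same two points as the base angle (for example via std::hypot). The search covers roughly ten million combinations, so it stays simple at the cost of speed. A closed-form solution would be faster but needs an extra constraint, because three joints in a plane can reach most points in more than one way.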



Modules

The goal is to create an application with easily insertable modules. Each module can be thought of as a series of moves with custom logic, extended from a template class. Each module is defined by a list of states. A state can change based on one of the following triggers:

  • Time Trigger: Wait for a specific time before executing the next move.
  • Interesting Frame Trigger: When movement is detected by the camera.
  • Command Execution Trigger: Wait for a signal from the USB controller indicating that the move was executed.

For example, a module for picking up an object could have the following states:

  1. Start (triggered by an interesting frame)
  2. Pick up the object (triggered by time)
  3. Move the object (triggered by time)
  4. Release the object and return to the default position (triggered by time)
  5. Verify if the object was moved (triggered by command execution)
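The state list above maps naturally onto a small base class. The following is a hedged sketch of the idea, not the template class from the repository: Trigger, State, Module and PickUpModule are hypothetical names, and each state simply records the trigger that causes it to be entered.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// The three trigger kinds described in the text.
enum class Trigger { Time, InterestingFrame, CommandExecuted };

// One step of a module: a label and the trigger that starts it.
struct State {
    std::string name;
    Trigger enteredOn;
};

// Base class (hypothetical) that concrete modules extend.
class Module {
public:
    explicit Module(std::vector<State> states) : states_(std::move(states)) {}
    virtual ~Module() = default;

    // Feeds an event to the module. If it matches the trigger of the next
    // state, that state is entered and onEnter() runs its custom logic.
    // Returns true when the event caused a state change.
    bool handle(Trigger event) {
        if (finished() || states_[next_].enteredOn != event) return false;
        onEnter(states_[next_]);
        ++next_;
        return true;
    }

    // True once every state in the list has been entered.
    bool finished() const { return next_ >= states_.size(); }

protected:
    virtual void onEnter(const State& state) = 0;

private:
    std::vector<State> states_;
    std::size_t next_ = 0;
};

// Example module mirroring the pick-up state list.
class PickUpModule : public Module {
public:
    PickUpModule()
        : Module(std::vector<State>{{"start", Trigger::InterestingFrame},
                                    {"pick up", Trigger::Time},
                                    {"move", Trigger::Time},
                                    {"release and return", Trigger::Time},
                                    {"verify", Trigger::CommandExecuted}}) {}

    std::vector<std::string> entered;  // names of the states entered so far

protected:
    // A real module would queue servo commands here instead of logging.
    void onEnter(const State& state) override { entered.push_back(state.name); }
};
```

Events that do not match the next state's trigger are ignored, so a stray timer tick before the first interesting frame does not start the sequence.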

First Tests

While working with C++ and OpenCV, I encountered a small issue: you cannot display an image from a background thread; only the main thread can call the imshow() function. To solve this, I used a singleton instance to store the images, and the main thread then displays them.
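The singleton approach could look roughly like the sketch below. It is an assumption about the shape of such a class, not the code from the repository: FrameStore is a hypothetical name, and Image is a placeholder type so that the example compiles without OpenCV (in the real program it would be cv::Mat).

```cpp
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Singleton that holds the latest image per window name. `Image` is a
// placeholder type; in an OpenCV program it would be cv::Mat.
template <typename Image>
class FrameStore {
public:
    static FrameStore& instance() {
        static FrameStore store;  // created once; thread-safe since C++11
        return store;
    }

    // Called from any background thread instead of imshow().
    // A newer image for the same window replaces the older one.
    void put(const std::string& window, Image image) {
        std::lock_guard<std::mutex> lock(mutex_);
        frames_[window] = std::move(image);
    }

    // Called from the main thread: hands over all pending images and
    // clears the store, so each image is displayed once.
    std::map<std::string, Image> takeAll() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::map<std::string, Image> pending;
        pending.swap(frames_);
        return pending;
    }

private:
    FrameStore() = default;
    FrameStore(const FrameStore&) = delete;
    FrameStore& operator=(const FrameStore&) = delete;

    std::mutex mutex_;
    std::map<std::string, Image> frames_;
};
```

The main loop would then iterate over FrameStore<cv::Mat>::instance().takeAll(), call cv::imshow(name, image) for each entry, and finish with cv::waitKey(1). Since copying a cv::Mat shares the pixel buffer, a background thread that keeps modifying its frame would typically store frame.clone().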

One of the unresolved issues is detecting the object's height. With a single camera looking down at the scene, the height cannot be determined directly. This could potentially be addressed by using a sensor at the end of the arm or another approach.

Despite these challenges, the initial tests for picking up objects were successful. However, more calibration is required. After calibration, machine learning could be used to identify the optimal spot on the object for successful picking. Additionally, the claws need to grip the object tightly enough to prevent it from slipping.

You can find the source code here: https://github.com/mbodis/rawcv



