Robotic arm with computer vision

Robotic arm with computer vision - picking up the object


The main idea was to build an environment with a robotic arm that can execute various commands based on image analysis of a scene. In this article I'm going to describe all parts of the idea. As the first task I've chosen detecting and moving a single object.


The whole environment consists of a few parts mounted together. For the base I chose an old table and repainted it white to get better contrast with the objects. Onto the middle of the longer side I mounted a robotic arm that I got from eBay. The arm has six servo motors, with a rotating base on one end and a claw on the other. The parts are made of aluminium and are quite solid. Then I took some perforated metal ledges, shortened them, mounted them to the corners of the table, and screwed it all together. Next I attached an RGB LED strip to the bottom side of the top part of the construction. Finally, I placed a USB camera at the top of the construction so it can see the whole scene.

Communication with the arm

The robotic arm has six servo motors. The quickest way to drive them is a servo controller that can move a single servo or a group of servos at once. I chose a controller with a serial interface and a custom protocol, so the communication can be done over USB with a few lines of code in any language.

Example of a group operation:
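A group move boils down to a single command string that addresses several channels at once. The sketch below assumes an SSC-32-style ASCII syntax (`#<channel>P<pulse>T<time>`) purely as an illustration; the real controller speaks its own custom protocol, and the helper names are mine.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One servo target: controller channel and pulse width in microseconds.
struct ServoTarget { int channel; int pulseUs; };

// Build a group-move command. All listed servos start together and
// finish together after timeMs (that is what makes it a group move).
std::string groupMoveCommand(const std::vector<ServoTarget>& targets, int timeMs) {
    std::ostringstream cmd;
    for (const auto& t : targets)
        cmd << '#' << t.channel << 'P' << t.pulseUs;
    cmd << 'T' << timeMs << '\r';
    return cmd.str();
}
```

Sending the move is then a single write of the returned string to the serial port.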

Logic flow

The application has a few independent parts that communicate with each other.

The camera input runs in a separate thread and preprocesses each interesting frame. (An interesting frame is one captured after movement was detected.) The result of the preprocessing is a list of detected objects with their coordinates.
The interesting frame is then sent to the main logic, where all the modules are registered. If no module is active, the main logic tries to initialize the first one whose initial conditions are satisfied. If a module is active, the interesting frame is sent to it. The module takes care of the logic and decides what to do next, then puts movement commands into the queue for the USB communicator.
The USB communicator repeatedly reads messages from its queue and sends commands to the controller via USB. The controller then moves the servo motors.
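A minimal sketch of the USB communicator's queue-and-worker pattern, with the actual serial write abstracted into a callback so the sketch stays self-contained (the class and function names are mine, not from the project):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Modules push command strings into the queue; a worker thread pops
// them one by one and forwards each to the serial port.
class UsbCommunicator {
public:
    explicit UsbCommunicator(std::function<void(const std::string&)> send)
        : send_(std::move(send)), worker_([this] { run(); }) {}

    ~UsbCommunicator() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_one();
        worker_.join();                      // drains remaining commands
    }

    void enqueue(std::string cmd) {
        { std::lock_guard<std::mutex> lk(m_); queue_.push(std::move(cmd)); }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;      // done_ set and queue drained
            std::string cmd = std::move(queue_.front());
            queue_.pop();
            lk.unlock();
            send_(cmd);                      // here: write to the serial port
        }
    }

    std::function<void(const std::string&)> send_;
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;                     // last member: starts after the rest
};
```

The condition variable lets the worker sleep while the queue is empty instead of busy-polling.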

schema of logic flow

Calculation of the next move

One of the most frequently used features will be picking up an object. After we get the preprocessed input from the camera, we have to calculate the move that picks the object up. At this point we have a frame with the detected object and the center of the arm. We also know the real size of the table, the lengths of the arm segments, and the base height. Our task is to calculate the angle for each servo so the arm can reach and pick up the object. We can split this problem into two smaller ones, each of which is essentially a geometry exercise.

The first part is the base rotation (imagine a view from the top). This is a trigonometry exercise where we know the points of a triangle and two of its sides, and we want to calculate the angle between them. Then we convert the angle into a pulse length in milliseconds for the servo controller.
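The base rotation step can be sketched in a few lines, assuming table coordinates for the arm center and the object, and the common hobby-servo convention of mapping 0–180° onto 0.5–2.5 ms pulses (the real controller's mapping may differ):

```cpp
#include <cmath>

// Base rotation seen from the top: the arm base sits at (bx, by) and the
// object at (ox, oy), both in table coordinates derived from the frame.
// atan2 gives the rotation angle toward the object directly.
double baseAngleDeg(double bx, double by, double ox, double oy) {
    const double PI = std::acos(-1.0);
    return std::atan2(oy - by, ox - bx) * 180.0 / PI;
}

// Convert an angle to a servo pulse width in microseconds, assuming the
// usual 500-2500 us (0.5-2.5 ms) range over 0-180 degrees.
int angleToPulseUs(double angleDeg) {
    return static_cast<int>(500.0 + angleDeg / 180.0 * 2000.0 + 0.5);
}
```

So an object at 45° from the base axis would get a pulse of roughly 1 ms, the midpoint between 0.5 ms and 1.5 ms.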

The second part is the rotation of the three servos that lean the arm (imagine a view from the side). In the first version we do not know the height of the object, so we use a constant instead. The problem is very similar to the previous trigonometry one: this time we know the length of each segment and need its angle. If we substitute the right three values, we can tell whether the arm reaches the object. I used brute force to calculate the three angles (it took less than one second).
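The brute-force search can be sketched as forward kinematics of a three-segment planar arm plus an exhaustive scan over whole-degree angles. The segment lengths, base height, and angle limits below are illustrative, not the real arm's values:

```cpp
#include <array>
#include <cmath>

struct Angles { int a1, a2, a3; double err; };

// Side view: three segments of lengths l1..l3 starting at height baseH.
// Joint angles accumulate, so segment i points in direction a1+...+ai.
// We scan every whole-degree combination and keep the one whose tip
// lands closest to the target point (squared distance in err).
Angles bruteForceIK(double targetX, double targetZ,
                    double l1, double l2, double l3, double baseH) {
    const double PI = std::acos(-1.0);
    // Precompute sin/cos for every reachable integer degree (-270..450).
    std::array<double, 721> c{}, s{};
    for (int d = -270; d <= 450; ++d) {
        c[d + 270] = std::cos(d * PI / 180.0);
        s[d + 270] = std::sin(d * PI / 180.0);
    }
    Angles best{0, 0, 0, 1e18};
    for (int a1 = 0; a1 <= 180; ++a1) {
        double x1 = l1 * c[a1 + 270], z1 = baseH + l1 * s[a1 + 270];
        for (int a2 = -135; a2 <= 135; ++a2) {
            int t2 = a1 + a2;                // absolute direction of segment 2
            double x2 = x1 + l2 * c[t2 + 270], z2 = z1 + l2 * s[t2 + 270];
            for (int a3 = -135; a3 <= 135; ++a3) {
                int t3 = t2 + a3;            // absolute direction of segment 3
                double dx = x2 + l3 * c[t3 + 270] - targetX;
                double dz = z2 + l3 * s[t3 + 270] - targetZ;
                double err = dx * dx + dz * dz;
                if (err < best.err) best = {a1, a2, a3, err};
            }
        }
    }
    return best;
}
```

The precomputed sine/cosine tables keep the roughly 13 million combinations well under a second, which matches the timing mentioned above.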


The idea is to create an application with easily pluggable modules. Each module can be thought of as a series of moves with custom logic. Modules extend a common template class, and each module is defined by a list of states.
A state transition is fired by one of the following triggers:

time trigger - wait some time before the next move
interesting frame trigger - fires when the camera detects movement
command execution trigger - fires on a broadcast from the USB controller that the move was executed
So the application keeps all the logic for each task separated in its own module. For example, the module for picking up an object can have the following states:

start (interesting frame trigger)
pick up the object (time trigger)
move the object (time trigger)
release the object and return to the default position (time trigger)
verify the object was moved (waits until the last move is executed - command execution trigger)
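The state list above can be sketched as a small state machine. The `Module` class and trigger names below are my illustration of the template-class idea, not the project's actual code:

```cpp
#include <string>
#include <vector>

// The three trigger kinds described above.
enum class Trigger { Time, InterestingFrame, CommandExecuted };

// A module is an ordered list of named states; each state waits for one
// trigger kind and advances only when that trigger fires.
class Module {
public:
    void addState(const std::string& name, Trigger waitsFor) {
        order_.push_back({name, waitsFor});
    }
    const std::string& currentState() const { return order_[index_].name; }
    bool finished() const { return index_ >= order_.size(); }

    // Ignore triggers the current state is not waiting for.
    void onTrigger(Trigger fired) {
        if (!finished() && order_[index_].waitsFor == fired) ++index_;
    }

private:
    struct State { std::string name; Trigger waitsFor; };
    std::vector<State> order_;
    std::size_t index_ = 0;
};
```

The pick-up module would then register its five states in order and let the main logic feed it triggers.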

First testing

A small issue I came across with C++ OpenCV was that you cannot show an image from a background thread; only the main thread can call the imshow() method. So I used a singleton instance that stores the images, and the main thread then shows them.
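A minimal sketch of that singleton, with `std::string` standing in for `cv::Mat` so the sketch compiles without OpenCV (the class name is mine):

```cpp
#include <map>
#include <mutex>
#include <string>

// Background threads publish the latest frame per window name; the main
// loop polls latest() and is the only caller of cv::imshow.
class FrameStore {
public:
    static FrameStore& instance() {
        static FrameStore store;          // thread-safe init since C++11
        return store;
    }
    void publish(const std::string& window, const std::string& frame) {
        std::lock_guard<std::mutex> lk(m_);
        frames_[window] = frame;
    }
    std::string latest(const std::string& window) {
        std::lock_guard<std::mutex> lk(m_);
        return frames_[window];           // copy out under the lock
    }

private:
    FrameStore() = default;
    std::mutex m_;
    std::map<std::string, std::string> frames_;
};
```

Only the newest frame per window is kept, so the main thread never falls behind the camera.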

One of the open problems is detecting the object's height, which is not possible from a single camera. A sensor at the end of the arm, or some other approach, could be used.

Even though the first tests of picking up the object were successful, more calibration is required. After this calibration, learning could be used to find the best spot on an object for a successful pick. Also, the claw should grip the object firmly enough that it does not slip.

