
New in Diamondback: the redesigned C++ API is now included in cv_bridge. Please see the documentation there; the released version differs slightly from the proposal below.

cv_bridge is the ROS user's gateway to the world of OpenCV, and is used ubiquitously in vision nodes. Although it gets the job done, I've noticed some recurring issues that trip up new (and sometimes old) users. I think the C++ API has enough room for improvement to warrant a redesign.

Problems with the current C++ cv_bridge

  1. Unpredictable memory ownership semantics. This is the biggest problem, and has confused people with strange bugs in the past. For example:

    class Example {
      sensor_msgs::CvBridge bridge_;
      IplImage* saved_image_;

      void imageCallback(const sensor_msgs::ImageConstPtr& image_msg)
      {
        // Save incoming image for later use
        saved_image_ = bridge_.imgMsgToCv(image_msg, "bgr8");
      }

      void foo()
      {
        // On some other event, do something with the most recent saved image
        cvSaveImage("foo.jpg", saved_image_);
      }
    };
  2. More counter-intuitive ownership semantics: the IplImage* returned from imgMsgToCv is actually owned by the bridge, and must not be freed by the user. In practice, users start from the cv_bridge code samples, adapt them to do real work, create more IplImage* for intermediate results, and then forget to free the images they created.

  3. Need to create a CvBridge instance for each image stream. Due to the above behavior, code like the following, which converts two images with a single bridge, is unsafe:

    void imageCallback(const sensor_msgs::ImageConstPtr& left_msg,
                       const sensor_msgs::ImageConstPtr& right_msg)
    {
      sensor_msgs::CvBridge bridge;
      IplImage* left  = bridge.imgMsgToCv(left_msg, "bgr8");
      IplImage* right = bridge.imgMsgToCv(right_msg, "bgr8");
    }
  4. Lack of const-correctness. imgMsgToCv returns an unqualified IplImage*, which may point to const message data. In that case users are free to modify the original message data, and may not even realize they are doing it. I anticipate this being another source of bugs as more image processing is done in nodelets.

  5. It still uses IplImage*, when we've been actively pushing users towards the much safer and more convenient cv::Mat.

  6. Header and encoding information is not preserved with the returned IplImage*. When publishing, the encoding is specified again as an argument to cvToImgMsg, and the header must be filled in separately. Forgetting to copy the header to the new Image is a common mistake.
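
The ownership problems above can be modeled without ROS or OpenCV. The following is only a sketch, under the assumption that the bridge keeps a single internal image that every conversion overwrites; MockBridge, MockImage, and convert are hypothetical stand-ins and not part of the real sensor_msgs::CvBridge API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an image; the real code uses IplImage.
struct MockImage {
  std::string pixels;
};

// Hypothetical model of a bridge that owns its converted image.
// Assumption: one internal buffer, overwritten by every conversion.
class MockBridge {
public:
  // Returns a pointer into the bridge; the caller must not free it.
  MockImage* convert(const std::string& message_data) {
    buffer_.pixels = message_data;
    return &buffer_;
  }

private:
  MockImage buffer_;
};

// Converts two images with one bridge and reports whether they alias.
bool sharedBridgeAliases() {
  MockBridge bridge;
  MockImage* left = bridge.convert("left");
  MockImage* right = bridge.convert("right");
  // Both pointers refer to the same buffer, so left now holds "right".
  return left == right && left->pixels == "right";
}

// Same conversions with one bridge per stream.
bool separateBridgesAlias() {
  MockBridge left_bridge, right_bridge;
  MockImage* left = left_bridge.convert("left");
  MockImage* right = right_bridge.convert("right");
  return left == right;
}
```

In this model, sharedBridgeAliases() returns true and separateBridgesAlias() returns false, which is why each image stream needs its own bridge.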

Basic proposal

Note: the current implementation does not exactly match the API proposed here, but it's essentially the same.

First we declare an analogue of sensor_msgs/Image that uses a cv::Mat for data storage.

    // File cv_bridge/cv_bridge.h

    namespace cv_bridge {

    class CvImage
    {
    public:
      roslib::Header header; //!< ROS header
      std::string encoding;  //!< Image encoding ("mono8", "bgr8", etc.)
      cv::Mat image;         //!< Image data for use with OpenCV

    protected:
      // Allows sharing ownership with sensor_msgs::Image(Const)Ptr
      boost::shared_ptr<void const> tracked_object_;
    };

    typedef boost::shared_ptr<CvImage> CvImagePtr;
    typedef boost::shared_ptr<CvImage const> CvImageConstPtr;

    // ...

    }

ROS message to OpenCV

We will convert sensor_msgs/Image messages into CvImage, which addresses issues 5 and 6.

There are two basic use cases when converting a const sensor_msgs::Image:

  1. We want to modify the data in-place. We have to make a copy.
  2. We won't modify the data. We can safely share instead of copying.

Sharing the message data (vision researchers are neurotic about avoiding unnecessary copies) motivated the complexity of cv_bridge. Most of the problems come from not distinguishing between the two cases. In the proposed API we make them explicit:

    // Case 1: Always copy, returning a mutable CvImage
    CvImagePtr toCvCopy(const sensor_msgs::ImageConstPtr& source,
                        const std::string& encoding = std::string());
    CvImagePtr toCvCopy(const sensor_msgs::Image& source,
                        const std::string& encoding = std::string());

    // Case 2: Share if possible, returning a const CvImage
    CvImageConstPtr toCvShare(const sensor_msgs::ImageConstPtr& source,
                              const std::string& encoding = std::string());

The empty default for encoding takes the place of the "passthrough" encoding in the current API; unless an encoding is specified, the CvImage has the same encoding as the source.

In the shared case, we return a pointer to const CvImage to enforce the immutability of the image data (issue 4). If the desired encoding matches that of the source, toCvShare aliases the source image data. It also sets CvImage::tracked_object_ to the source pointer, ensuring the data is not deleted prematurely (issue 1). Note there is no overload taking a const sensor_msgs::Image&, because then there's no way to ensure the Image out-lives the CvImage.

Since there's no bridge object, and ownership is managed within CvImage, we sidestep issues 2 and 3.
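
To make the copy/share distinction concrete, here is a rough, ROS-free sketch. It assumes std::shared_ptr in place of boost::shared_ptr and a plain byte pointer in place of cv::Mat; SketchMsg, SketchImage, resolveEncoding, toCvCopySketch, and toCvShareSketch are hypothetical names, and pixel format conversion is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for sensor_msgs::Image.
struct SketchMsg {
  std::string encoding;
  std::vector<std::uint8_t> data;
};
typedef std::shared_ptr<const SketchMsg> SketchMsgConstPtr;

// Hypothetical stand-in for CvImage; a raw pointer plays the role of cv::Mat.
struct SketchImage {
  std::string encoding;
  std::vector<std::uint8_t> owned;             // storage when data was copied
  const std::uint8_t* pixels = nullptr;        // points at owned or the source
  std::shared_ptr<const void> tracked_object;  // keeps a shared source alive
};

// An empty requested encoding means "keep the source encoding".
std::string resolveEncoding(const SketchMsg& source,
                            const std::string& requested) {
  return requested.empty() ? source.encoding : requested;
}

// Case 1: always copy, returning a mutable image.
std::shared_ptr<SketchImage> toCvCopySketch(
    const SketchMsg& source, const std::string& encoding = std::string()) {
  std::shared_ptr<SketchImage> out = std::make_shared<SketchImage>();
  out->encoding = resolveEncoding(source, encoding);
  out->owned = source.data;  // real code would also convert pixel formats
  out->pixels = out->owned.data();
  return out;
}

// Case 2: share when the encodings match, returning a const image.
std::shared_ptr<const SketchImage> toCvShareSketch(
    const SketchMsgConstPtr& source,
    const std::string& encoding = std::string()) {
  if (resolveEncoding(*source, encoding) != source->encoding)
    return toCvCopySketch(*source, encoding);  // a conversion needs a copy
  std::shared_ptr<SketchImage> out = std::make_shared<SketchImage>();
  out->encoding = source->encoding;
  out->pixels = source->data.data();  // alias the message data, no copy
  out->tracked_object = source;       // share ownership of the message
  return out;
}
```

Because tracked_object holds a reference to the source message, the aliased pixels stay valid even after the caller drops its own pointer to the message.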

OpenCV to ROS message

Now we'll add a couple of methods to CvImage to go the other way:

    class CvImage
    {
    public:
      // Allocates and returns a new sensor_msgs::Image.
      sensor_msgs::ImagePtr toRos() const;

      // This overload is intended mainly for aggregate messages such as
      // stereo_msgs::DisparityImage, which contains a sensor_msgs::Image
      // as a data member.
      void toRos(sensor_msgs::Image& ros_image) const;
    };
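
One plausible way to relate the two overloads is to have the allocating version delegate to the fill-in version. The sketch below only illustrates that idea with hypothetical stand-in types (SketchHeader, SketchRosImage, SketchCvImage) and std::shared_ptr; it is not the actual cv_bridge implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the ROS header and sensor_msgs::Image.
struct SketchHeader {
  unsigned seq;
  std::string frame_id;
};

struct SketchRosImage {
  SketchHeader header;
  std::string encoding;
  std::vector<std::uint8_t> data;
};

// Hypothetical stand-in for CvImage; a byte vector replaces cv::Mat.
struct SketchCvImage {
  SketchHeader header;
  std::string encoding;
  std::vector<std::uint8_t> image;

  // Fills an existing message, e.g. a member of an aggregate message.
  void toRos(SketchRosImage& ros_image) const {
    ros_image.header = header;      // the header travels with the image
    ros_image.encoding = encoding;  // and so does the encoding
    ros_image.data = image;
  }

  // Allocates a new message and delegates to the overload above.
  std::shared_ptr<SketchRosImage> toRos() const {
    std::shared_ptr<SketchRosImage> msg = std::make_shared<SketchRosImage>();
    toRos(*msg);
    return msg;
  }
};
```

Since the header and encoding are copied inside toRos, the caller cannot forget them when publishing.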

Other functions

Analogous to cv::cvtColor, a convenience function for converting an image to another encoding:

    CvImagePtr cvtColor(const CvImageConstPtr& source,
                        const std::string& encoding);

And some functions for distinguishing categories of encodings. These might be better located in sensor_msgs/image_encodings.h.

    bool isColor(const std::string& encoding);
    bool isMono(const std::string& encoding);
    bool isBayer(const std::string& encoding);
    bool hasAlpha(const std::string& encoding);
    int  numChannels(const std::string& encoding);
    int  bitDepth(const std::string& encoding);
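
As a rough illustration, these helpers could be written as simple string checks. The sketch below covers only a handful of encoding names (mono, rgb/bgr with and without alpha, and the bayer_ family), and returning -1 for anything unrecognized is an assumption of this sketch rather than specified behavior.

```cpp
#include <cassert>
#include <string>

// Sketch only: recognizes a small subset of encoding strings.
bool isMono(const std::string& e) {
  return e == "mono8" || e == "mono16";
}

bool hasAlpha(const std::string& e) {
  return e == "rgba8" || e == "bgra8" || e == "rgba16" || e == "bgra16";
}

bool isColor(const std::string& e) {
  return e == "rgb8" || e == "bgr8" || e == "rgb16" || e == "bgr16" ||
         hasAlpha(e);
}

bool isBayer(const std::string& e) {
  // True when the name starts with "bayer_", e.g. "bayer_rggb8".
  return e.compare(0, 6, "bayer_") == 0;
}

// Returns -1 for encodings this sketch does not recognize (assumption).
int numChannels(const std::string& e) {
  if (isMono(e) || isBayer(e)) return 1;
  if (hasAlpha(e)) return 4;
  if (isColor(e)) return 3;
  return -1;
}

// Reads the bit depth from the numeric suffix; -1 if unrecognized (assumption).
int bitDepth(const std::string& e) {
  if (!(isMono(e) || isColor(e) || isBayer(e))) return -1;
  if (e.size() >= 2 && e.compare(e.size() - 2, 2, "16") == 0) return 16;
  if (e[e.size() - 1] == '8') return 8;
  return -1;
}
```

A callback could then branch on isColor(msg->encoding) before choosing a target encoding.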

Example usage

A complete example of a node that draws a circle on images and republishes them.

    #include <ros/ros.h>
    #include <image_transport/image_transport.h>
    #include <cv_bridge/cv_bridge.h>

    image_transport::Publisher pub;

    void imageCallback(const sensor_msgs::ImageConstPtr& msg)
    {
      cv_bridge::CvImagePtr cv_msg = cv_bridge::toCvCopy(msg, "bgr8");
      cv::circle(cv_msg->image, cv::Point(50, 50), 10, CV_RGB(255,0,0));
      pub.publish(cv_msg->toRos());
    }

    int main(int argc, char** argv)
    {
      ros::init(argc, argv, "processor");
      ros::NodeHandle nh;
      image_transport::ImageTransport it(nh);
      pub = it.advertise("image_out", 1);
      image_transport::Subscriber sub = it.subscribe("image", 1, imageCallback);
      ros::spin();
    }

A more complicated callback example. In this case we want to use color if available, otherwise falling back to monochrome. We avoid copies for bgr8 and mono8 encodings. In either case, we republish the source image with color annotations.

    using namespace cv_bridge;

    void imageCallback(const sensor_msgs::ImageConstPtr& msg)
    {
      bool is_color = isColor(msg->encoding);
      CvImageConstPtr source = is_color ? toCvShare(msg, "bgr8")
                                        : toCvShare(msg, "mono8");

      // Do vision processing on source...

      // Now we create and publish an annotated color image
      CvImagePtr display = cvtColor(source, "bgr8");
      // Draw detected objects, etc. on display->image...
      pub.publish(display->toRos());
    }

Backwards compatibility

Since the proposal defines entirely new data structures and functions, the original sensor_msgs::CvBridge can continue to exist as-is. Eventually it would be deprecated and removed.

Possible extensions

Bayer support

The current cv_bridge doesn't support the Bayer encodings, as debayering should normally be done by image_proc. But maybe that would be a nice convenience?

Using CvImage as a message type

By hooking into the roscpp serialization API, it's possible to make CvImage a real ROS message, wire-compatible with sensor_msgs/Image. This is already implemented in the cv_bridge_redesign package, and it's quite nice; you can use callbacks of the form

void imageCb(const cv_bridge::CvImageConstPtr& image_msg);

and publish CvImage directly. No toCvCopy/toCvShare/toRos required! And when publishing, we avoid the copy from converting to sensor_msgs::Image.

The big issue is how to make this play nice with image_transport. image_transport could be generalized, by templating the publish() and subscribe() methods, but it's somewhat complicated and would require more coding effort.


2024-12-07 14:43