Wednesday, February 10, 2010

Boost Python and Custom Classes

Yesterday I started messing around with exposing our C++ XML parsing code to Python using Boost.Python. I've gone down this road before, but never with a complex class structure. About 15 minutes into the task I realized I would need to do exactly that, and I had no idea it was going to cause so many issues.

Simply exposing a class and the functions I needed wasn't exactly straightforward. For about a day and a half I kept running into Boost compile errors, mainly along the lines of "specify_a_return_value_policy_to_wrap_functions_returning".

Scouring the interwebs for hours didn't yield many results, but I finally stumbled across the solution. It all comes down to how the functions are defined inside the module. For example, this is what my module looked like before I found my solution:


class_<parse_xml, boost::noncopyable>("parse_xml", boost::python::no_init)
.def("find_element", &parse_xml::find_element)
.def("find_next_element", &parse_xml::find_next_element)
.def("get_element_data", &parse_xml::get_element_data)
;

class_<xml_element>("xml_element");

def("get_filepath_from_vac", get_vdir_path_cf);
def("parse_chunkx", parse_chunkx);



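As I said above, this obviously wasn't working. It turns out the key is to add another parameter to the .def call: boost::python::return_internal_reference<>().
What this does is tell Boost.Python how to manage the lifetime of the object the function returns. The compile error shows up when a wrapped function returns a pointer or a reference, because Boost.Python won't guess who owns the object on the other end. In my case the functions return a custom class that way. return_internal_reference<>() says the result lives inside the function's first argument (for a member function, the object it was called on), so Python keeps that owner alive for as long as the result is in use. So, the moral of the story is: if you are using Boost.Python to expose a function that returns a pointer or reference to a custom datatype, make sure you include a call policy in the def arguments. boost::python::return_internal_reference<>() fits when the first argument owns the result, which also has to hold for a free function like parse_chunkx; other policies, such as return_value_policy<manage_new_object>(), cover other ownership cases.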

This is what my code looked like afterwards:


class_<parse_xml, boost::noncopyable>("parse_xml", boost::python::no_init)
.def("find_element", &parse_xml::find_element, boost::python::return_internal_reference<>())
.def("find_next_element", &parse_xml::find_next_element, boost::python::return_internal_reference<>())
.def("get_element_data", &parse_xml::get_element_data, boost::python::return_internal_reference<>())
;

class_<xml_element>("xml_element");

def("get_filepath_from_vac", get_vdir_path_cf);
def("parse_chunkx", parse_chunkx, boost::python::return_internal_reference<>());
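
To see why Boost.Python asks for a policy at all, here is a minimal sketch of the ownership pattern in plain C++, with no Boost involved. The element and parser types are hypothetical stand-ins, not the real parse_xml and xml_element classes; the only assumption carried over is that the lookup function hands back a pointer into storage owned by the parser.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real classes; only the ownership pattern matters.
struct element {
    std::string name;
};

class parser {
public:
    explicit parser(std::vector<element> elements)
        : elements_(std::move(elements)) {}

    // Returns a pointer INTO this parser's storage, or nullptr if nothing matches.
    // The pointer is only valid while the parser is alive and its storage is unchanged.
    element *find_element(const std::string &name) {
        for (element &e : elements_) {
            if (e.name == name) {
                return &e;
            }
        }
        return nullptr;
    }

private:
    std::vector<element> elements_;  // the parser owns every element
};
```

The pointer returned by find_element is useless once the parser is gone. In C++ that is the caller's problem; in Python, return_internal_reference<>() is how the binding records it, so the parser object stays alive as long as the returned element is referenced.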



Another thing I ran into: accessing members of an exposed struct wasn't working as expected. It comes down to the fact that if you have a char * member that you want to read in Python, you can't get at it without a wrapper. A bare char * carries no size information, which makes it basically useless to Python on its own.

So, to access a char * member, you need to write a wrapper function that returns a const char *, which Boost.Python converts to a Python string (treating the data as NUL-terminated). For example:



// Wrapper so Python can read the element's text, which is a char * member.
const char *parse_xml::get_value_from_element(xml_element *root) {
    return root->text;
}



Now I can easily access the member variable I was going after.
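
For what it's worth, here is a slightly more defensive sketch of the same idea as free functions. The struct layout is an assumption (the post only shows that xml_element has a text member), and the null guard and the std::string variant are my own additions rather than part of the original code.

```cpp
#include <string>

// Assumed shape of the struct: only the `text` member is known from the wrapper above.
struct xml_element {
    char *text;
};

// Free-function variant of the wrapper. Returns an empty string instead of
// dereferencing a null element or handing back a null text pointer.
const char *get_value_from_element(const xml_element *root) {
    if (root == nullptr || root->text == nullptr) {
        return "";
    }
    return root->text;
}

// Copies the text into a std::string, so the result no longer depends on
// the element's buffer staying alive.
std::string copy_value_from_element(const xml_element *root) {
    return std::string(get_value_from_element(root));
}
```

Both return types are ones Boost.Python converts to a Python string, so either function could be registered with def without a call policy. Both still rely on text being NUL-terminated.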
