Python bindings using cppyy (was: An update on Python bindings)

Shaheed Haque srhaque at theiet.org
Thu Nov 2 18:22:38 UTC 2017


A progress update...

On 24 October 2017 at 13:05, Shaheed Haque <srhaque at theiet.org> wrote:
> Hi all,
>
> I have a preliminary version of the Cppyy bindings generator CMake
> support available here:
>
>     https://bitbucket.org/wlav/cppyy-backend/pull-requests/6/an-interim-experimental-version-of-a/diff
>
> There are some TODOs yet to be addressed,

The original TODOs and bugs have been resolved, and there are the
beginnings of support for packaging frameworks under a Python
namespace as in "KF5.KDCRAW". Also, as a significant datapoint, I'm
close [1] to being able to generate a *complete* set of bindings for
all of Akonadi driven from CMake with just 2-3 lines of custom logic.
This contrasts with the 549 SLOC of customisation needed to produce a
substantially slash-and-burned subset of Akonadi [2] with the
SIP-based approach.

> but I would appreciate
> feedback on how easy it would be to integrate this with KDE's
> buildsystem, especially for the frameworks. I'm a CMake noob, but the
> basic idea I have is that the packager of some_framework might do
> something like this:
>
> find_package(cppyy)
> CPPYY_ADD_BINDINGS(
>     ...
>     LINK_LIBRARIES some_framework_LIBRARIES
>     H_DIR some_framework_INCLUDE_DIRS
>     H_FILES <list_of_h_files>)

In the course of working through the "KF5" namespace implementation,
it has become apparent to me that a framework-by-framework integration
of the binding generation logic (as previously pioneered by Steve)
probably cannot work in general because there are cases where multiple
frameworks contribute to the same C++ namespace, for example:

$ grep -r '^namespace Akonadi' /usr/include/KF5/Akonadi*
/usr/include/KF5/AkonadiAgentBase/resourcesettings.h:namespace Akonadi
...
/usr/include/KF5/AkonadiCore/agentfilterproxymodel.h:namespace Akonadi
...
/usr/include/KF5/AkonadiSearch/Debug/akonadisearchdebugsearchpathcombobox.h:namespace Akonadi
...
/usr/include/KF5/AkonadiWidgets/agenttypedialog.h:namespace Akonadi
...
/usr/include/KF5/AkonadiXml/xmldocument.h:namespace Akonadi
...

The problem is that the Python implementation of these namespaces is a
class, and so treating these frameworks (let's not quibble over
whether KF5Akonadi* are truly KF5 frameworks, the point is more
general) as separate would result in multiple colliding Python class
definitions. The only solution I can see would be to bundle all of
KF5Akonadi* into a single set of bindings, e.g. KF5.Akonadi, and
AFAICS, this can only be done out of tree from the individual
frameworks, say in kde-bindings.git [3].
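
To make the collision concrete, here is a minimal plain-Python sketch. It
does not use cppyy at all; the fake modules and the make_bindings() helper
are hypothetical stand-ins for two independently generated binding sets,
and the member names are only illustrative. It shows that a shared "KF5"
package can hold a single attribute called Akonadi, so whichever set of
bindings is installed last hides the other.

```python
import types


def make_bindings(member_names):
    """Build a fake per-framework module that exposes the C++ namespace
    'Akonadi' as a Python class (hypothetical stand-in for generated code)."""
    mod = types.ModuleType("fake_bindings")
    members = {name: object() for name in member_names}
    mod.Akonadi = type("Akonadi", (), members)
    return mod


# Two frameworks, each contributing to the same C++ namespace.
core = make_bindings(["AgentFilterProxyModel"])
widgets = make_bindings(["AgentTypeDialog"])

# A shared Python package can only bind the name 'Akonadi' once.
KF5 = types.ModuleType("KF5")
KF5.Akonadi = core.Akonadi
KF5.Akonadi = widgets.Akonadi  # rebinding: the first class is no longer reachable

has_core = hasattr(KF5.Akonadi, "AgentFilterProxyModel")
has_widgets = hasattr(KF5.Akonadi, "AgentTypeDialog")
print(has_core, has_widgets)
```

Bundling all the contributing headers into one set of bindings avoids the
rebinding, because the class is then only created once.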

The work to date attempts to maintain a clean separation such that all
C++ builds are done from CMake, and all Python builds are done using
setuptools/pip.

Apart from working through bugs [1], the remaining work items I can
think of are, as before:

>> - Need to look into the exact usage of Qt-specifics: signals/slots and
>> interoperability with SIP-based PyQt
>> (https://root.cern.ch/root/htmldoc/guides/users-guide/PythonRuby.html#glue-ing-applications,
>> https://root.cern.ch/doc/v606_ORIG/guide/ROOTandQt.html)
>>
>> - Need to figure out how any customisations which *are* required
>> should be handled.

plus:

- I'm working with upstream on how to support discovery (e.g. via
autocompletion in Python3). There is some POC-level hackery in git as
above, but there is work ongoing with upstream to find a robust
solution.

- Flesh out how to make one set of bindings depend on another (e.g.
tier 2 framework bindings might depend on tier 1 bindings, or maybe it
is better to avoid PyQt and just produce cppyy-based bindings for Qt
and depend on those).

As always, comments/ideas/suggestions are welcome.

Thanks, Shaheed

[1] There is a bug with namespaced externs being worked on with upstream.

[2] https://github.com/ShaheedHaque/extra-cmake-modules/blob/shaheed_master/find-modules/module_generation/PyKF5/Akonadi.py

[3] I attach an example CMakeLists.txt which shows how this can be
driven from CMake for the case of KF5Akonadi*...the implementation is
intended to serve as the basis for a generic solution usable across
KF5 at least.

> On 16 October 2017 at 16:16, Shaheed Haque <srhaque at theiet.org> wrote:
>> As promised, here is an interim update on the investigation into the
>> use of cppyy-based bindings for KF5 (and more...) instead of SIP-based
>> bindings.
>>
>> The first thing is that the underlying technology of cppyy,
>> cling/ROOT, has been under development at CERN for quite a while. It
>> directly reads regular C++ files (there is no intermediate format like
>> SIP).
>>
>> The bindings it generates from Python to C++ seem far more complete
>> and automatic than SIP. For example:
>>
>> - Template instantiation is done on the fly as needed.
>>
>> - Since it uses C++ directly, there is none of the effort required to
>> decollide SIP's notion of forward and duplicate declarations.
>>
>> - Function overloads are cleanly handled, as are most (all?) operators.
>>
>> The net result is that so far, there is about 3 days' work and
>> approximately [1] no "customisation" required in order to get to
>> roughly where the SIP based bindings were after 18 months. Without the
>> need for customisations on a mass scale, I suspect that we might get
>> away without anything like the tooling I had to create for SIP, and
>> just integrate with CMake
>> (https://root.cern.ch/how/integrate-root-my-project-cmake).
>>
>> This all sounds pretty amazing, right? Well, there are a few caveats...
>>
>> - The packaging is pretty new, and is evolving pretty rapidly. We
>> are/will be an early adopter (https://bitbucket.org/wlav/cppyy/ and
>> https://bitbucket.org/wlav/cppyy-backend). Packaging is via PyPI and
>> pip/pip3.
>>
>> - There is a lot of documentation around for the system overall, but
>> frankly, it has been/still is a struggle to understand how the
>> different parts relate to each other as some parts are obsolete, and
>> other parts have yet to be built out to their intended end-state.
>>
>> - There are bugs [1], [2]. The upstream dev has been very responsive,
>> and the overall quality approach looks sound. IIUC, the vast bulk of
>> the code seems to be in daily use at CERN (and is based on LLVM).
>>
>> - Need to look into the exact usage of Qt-specifics: signals/slots and
>> interoperability with SIP-based PyQt
>> (https://root.cern.ch/root/htmldoc/guides/users-guide/PythonRuby.html#glue-ing-applications,
>> https://root.cern.ch/doc/v606_ORIG/guide/ROOTandQt.html)
>>
>> - Need to figure out how any customisations which *are* required
>> should be handled.
>>
>> These seem like perfectly tractable issues, and so I conclude that
>> using cppyy is definitely the way to go. With luck and a bit of
>> effort, I am hopeful that we can get to some REALLY
>> easy-to-develop-and-maintain bindings.
>>
>> [1] There is a bug with the binding producing stuff for private definitions.
>>
>> [2] There is a bug with missing globals.
>>
>>
> [snip]
-------------- next part --------------
cmake_minimum_required(VERSION 3.9)

find_package(Cppyy)
find_package(ECM REQUIRED NO_MODULE)
set(CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR} ${ECM_MODULE_PATH})
include(FeatureSummary)
include(FindPkgConfig)
include(CMakeFindDependencyMacro)

#
# Find the targets and dependencies for a KDE component.
#
set(_DEPENDENCIES)
function(get_kf5_info component)
    find_dependency(${component})
    set(real_targets)
    set(real_dependencies)
    #
    # Loop over all cmake files.
    #
    set(file_glob  ${${component}_DIR}/*.cmake)
    file(GLOB files ${file_glob})
    foreach(f ${files})
        #
        # Targets.
        #
        file(STRINGS ${f} matches REGEX "^ *add_library\\(.*\\) *")
        if(NOT matches STREQUAL "")
            foreach(target ${matches})
                string(REGEX REPLACE " *add_library\\(([^ \\)]+).*" "\\1" target ${target})
                if(TARGET ${target})
                    list(APPEND real_targets ${target})
                    list(REMOVE_DUPLICATES real_targets)
                else()
                    message(STATUS "Ignoring invalid target \"${target}\" for ${component} in ${f}")
                endif()
            endforeach()
        endif()
        #
        # Dependencies.
        #
        file(STRINGS ${f} matches REGEX "^ *find_dependency\\(.*\\) *")
        if(NOT matches STREQUAL "")
            foreach(dependency ${matches})
                string(REGEX REPLACE " *find_dependency\\(([^ \\)]+).*" "\\1" dependency ${dependency})
                if(NOT ${dependency} STREQUAL "")
                    list(APPEND real_dependencies ${dependency})
                    list(REMOVE_DUPLICATES real_dependencies)
                    #
                    # Recurse...if we have not been here before.
                    #
                    string(FIND ${dependency} "KF5" found)
                    if(found EQUAL 0 AND NOT dependency IN_LIST _DEPENDENCIES)
                        get_kf5_info(${dependency})
                        list(APPEND real_dependencies ${dependencies})
                        list(REMOVE_DUPLICATES real_dependencies)
                    endif()
                else()
                    message(STATUS "Ignoring invalid dependency \"${dependency}\" for ${component} in ${f}")
                endif()
            endforeach()
        endif()
    endforeach()
    set(targets "${real_targets}" PARENT_SCOPE)
    set(dependencies "${real_dependencies}" PARENT_SCOPE)
endfunction(get_kf5_info)

#
# Find the targets and dependencies for a Qt component.
#
function(get_qt5_info component)
    find_dependency(${component} NO_MODULE)
    #
    # Targets.
    #
    string(REPLACE "Qt5" "Qt5::" target ${component})
    if(NOT TARGET ${target})
        message(STATUS "Ignoring invalid target \"${target}\" for ${component}")
        set(target "")
    endif()
    set(targets "${target}" PARENT_SCOPE)
    #
    # Dependencies.
    #
    set(dependencies "" PARENT_SCOPE)
endfunction(get_qt5_info)

#
# Fetch a target property, recursing if necessary.
#
function(get_target_property_recursive target property)
    set(result)
    get_target_property(values ${target} ${property})
    if(values STREQUAL "values-NOTFOUND")
        # Skip
        # message(STATUS "Warning: Target ${target} has no property ${property}")
    else()
        foreach(value ${values})
            string(FIND ${value} "$<TARGET_PROPERTY:" found)
            if(found EQUAL 0)
                #
                # Recurse. The format is:
                #
                # $<TARGET_PROPERTY:KF5::WebKit,INTERFACE_INCLUDE_DIRECTORIES>
                #
                string(REGEX REPLACE "\\$<TARGET_PROPERTY:(.*),(.*)>" "\\1" nested_tgt ${value})
                string(REGEX REPLACE "\\$<TARGET_PROPERTY:(.*),(.*)>" "\\2" nested_prop ${value})
                get_target_property_recursive(${nested_tgt} ${nested_prop})
                list(APPEND result ${get_target_property_recursive_result})
            else()
                list(APPEND result ${value})
            endif()
        endforeach()
    endif()
    set(get_target_property_recursive_result "${result}" PARENT_SCOPE)
endfunction(get_target_property_recursive)

#
# Find the includes, libraries etc. for a component.
#
function(get_targets_info component targets)
    if(targets STREQUAL "")
        message(STATUS "Warning: No targets for ${component}")
        return()
    endif()
    #
    # Make a combined list of includes, libraries etc.
    #
    # There is a potential impedance mismatch between the directory-centric
    # Pythonic notion of a package, and the possibility that the multiple
    # targets *might* have conflicting options. Luckily, this seems not to be
    # a problem in KF5.
    #
    set(libraries)
    set(includes)
    set(compile_flags)
    foreach(target ${targets})
        if(TARGET ${target})
            get_target_property(tmp ${target} LOCATION)
            list(APPEND libraries ${tmp})
            get_target_property_recursive(${target} INTERFACE_INCLUDE_DIRECTORIES)
            list(APPEND includes ${get_target_property_recursive_result})
            get_target_property_recursive(${target} INTERFACE_COMPILE_DEFINITIONS)
            foreach(definition ${get_target_property_recursive_result})
                if(${definition} MATCHES ".*QT_NO_DEBUG>")
                    #
                    # Qt uses the formulation "$<$<NOT:$<CONFIG:Debug>>:QT_NO_DEBUG>".
                    #
                elseif(${definition} MATCHES "QT_.*_LIB")
                    #
                    # Qt uses the formulation "QT_CORE_LIB" even for INTERFACE_COMPILE_DEFINITIONS.
                    #
                else()
                    list(APPEND compile_flags "-D${definition}")
                endif()
            endforeach()
            get_target_property_recursive(${target} INTERFACE_COMPILE_OPTIONS)
            list(APPEND compile_flags ${get_target_property_recursive_result})
        else()
            message(STATUS "Warning: Ignoring invalid target \"${target}\" for ${component}")
        endif()
    endforeach()
    #
    # De-duplicate and write results.
    #
    if(DEFINED includes)
        list(REMOVE_DUPLICATES includes)
        #
        # Not sure why the headers seem to include this.
        #
        list(REMOVE_ITEM includes "/usr/include")
    endif()
    if(DEFINED compile_flags)
        list(REMOVE_DUPLICATES compile_flags)
    endif()
    set(libraries "${libraries}" PARENT_SCOPE)
    set(includes "${includes}" PARENT_SCOPE)
    set(compile_flags "${compile_flags}" PARENT_SCOPE)
endfunction(get_targets_info)

#
# Find the includes, libraries etc. for a pkg-config component.
#
function(get_pkgconfig_info component)
    set(libraries)
    set(includes ${${component}_INCLUDEDIR})
    set(compile_flags ${${component}_CFLAGS})
    foreach(tmp ${${component}_LIBRARIES})
        find_library(lib${tmp} NAMES ${tmp} PATHS ${${component}_LIBRARY_DIRS})
        list(APPEND libraries ${lib${tmp}})
    endforeach()
    set(libraries "${libraries}" PARENT_SCOPE)
    set(includes "${includes}" PARENT_SCOPE)
    set(compile_flags "${compile_flags}" PARENT_SCOPE)
endfunction(get_pkgconfig_info)


#
# Return the information required to create the bindings for a set of KF5 components.
#
#   get_kf5_binding_info(
#       COMPONENTS components
#       DEPENDENCIES extras)
#
# Arguments and options:
#
#   COMPONENTS component
#                       The CMake packages to include in the bindings.
#
#   DEPENDENCIES dependency
#                       Any CMake packages not detected by the automatic
#                       dependency extraction logic.
#
function(get_kf5_binding_info)
    cmake_parse_arguments(
        ARG
        ""
        ""
        "COMPONENTS;DEPENDENCIES"
        ${ARGN})
    if(NOT "${ARG_UNPARSED_ARGUMENTS}" STREQUAL "")
        message(SEND_ERROR "Unexpected arguments specified '${ARG_UNPARSED_ARGUMENTS}'")
    endif()
    if("${ARG_COMPONENTS}" STREQUAL "")
        message(SEND_ERROR "No COMPONENTS specified")
    endif()
    #
    # Find dependencies and other info.
    #
    set(_H_DIRS)
    set(_H_FILES)
    set(_COMPILE_OPTIONS)
    set(_INCLUDE_DIRS)
    set(_LINK_LIBRARIES)
    foreach(component IN LISTS ARG_COMPONENTS)
        get_kf5_info(${component})
        #
        # Automatic dependencies.
        #
        list(APPEND _DEPENDENCIES ${dependencies})
        list(REMOVE_DUPLICATES _DEPENDENCIES)
        #
        # Other info.
        #
        get_targets_info(${component} "${targets}")
        list(APPEND _H_DIRS ${includes})
        list(APPEND _LINK_LIBRARIES ${libraries})
        list(APPEND _COMPILE_OPTIONS "${compile_flags}")
        list(REMOVE_DUPLICATES _H_DIRS)
        list(REMOVE_DUPLICATES _LINK_LIBRARIES)
        list(REMOVE_DUPLICATES _COMPILE_OPTIONS)
    endforeach(component)
    #
    # Find all header files.
    #
    foreach(h_dir IN LISTS _H_DIRS)
        file(GLOB tmp ${h_dir}/*.h)
        list(APPEND _H_FILES ${tmp})
        list(REMOVE_DUPLICATES _H_FILES)
    endforeach(h_dir)
    list(FILTER _H_FILES EXCLUDE REGEX ".*_version.h")
    #
    # Add dependencies.
    #
    foreach(component IN LISTS _DEPENDENCIES ARG_DEPENDENCIES)
        if(component MATCHES "^KF5")
            get_kf5_info(${component})
        elseif(component MATCHES "^Qt5")
            get_qt5_info(${component})
        endif()
        get_targets_info(${component} "${targets}")
        list(APPEND _INCLUDE_DIRS ${includes})
        list(APPEND _LINK_LIBRARIES ${libraries})
        list(APPEND _COMPILE_OPTIONS "${compile_flags}")
        list(REMOVE_DUPLICATES _INCLUDE_DIRS)
        list(REMOVE_DUPLICATES _LINK_LIBRARIES)
        list(REMOVE_DUPLICATES _COMPILE_OPTIONS)
    endforeach(component)
    #
    # Find the version from the first component.
    #
    list(GET ARG_COMPONENTS 0 first_component)
    include(${${first_component}_DIR}/${first_component}ConfigVersion.cmake)
    #
    # Return results.
    #
    set(version ${PACKAGE_VERSION} PARENT_SCOPE)
    set(h_dirs ${_H_DIRS} PARENT_SCOPE)
    set(h_files ${_H_FILES} PARENT_SCOPE)
    set(include_dirs ${_INCLUDE_DIRS} PARENT_SCOPE)
    set(compile_options ${_COMPILE_OPTIONS} PARENT_SCOPE)
    set(link_libraries ${_LINK_LIBRARIES} PARENT_SCOPE)
    #message("version=${PACKAGE_VERSION}")
    #message("h_dirs=${_H_DIRS}")
    #message("h_files=${_H_FILES}")
    #message("include_dirs=${_INCLUDE_DIRS}")
    #message("compile_options=${_COMPILE_OPTIONS}")
    #message("link_libraries=${_LINK_LIBRARIES}")
endfunction(get_kf5_binding_info)

#
# Main code.
#
get_kf5_binding_info(
    COMPONENTS KF5Akonadi KF5AkonadiCalendar KF5AkonadiContact KF5AkonadiMime KF5AkonadiNotes KF5AkonadiSearch
    DEPENDENCIES KF5Konq)
list(FILTER h_files EXCLUDE REGEX ".*qtest_akonadi.h")
list(FILTER h_files EXCLUDE REGEX ".*/KF5/[^/]+.h")
CPPYY_ADD_BINDINGS(
    "KF5.Akonadi" "${version}" "Shaheed" "srhaque at theiet.org"
    LANGUAGE_STANDARD "14"
    GENERATE_OPTIONS "-D__PIC__;-Wno-macro-redefined"
    INCLUDE_DIRS ${include_dirs}
    LINK_LIBRARIES ${link_libraries}
    H_DIRS ${h_dirs}
    H_FILES ${h_files})

