[rkward-cvs] SF.net SVN: rkward:[3253] branches/jss_dec_10/FINAL_JSS_TEX
tfry at users.sourceforge.net
Thu Dec 16 16:07:50 UTC 2010
Revision: 3253
http://rkward.svn.sourceforge.net/rkward/?rev=3253&view=rev
Author: tfry
Date: 2010-12-16 16:07:50 +0000 (Thu, 16 Dec 2010)
Log Message:
-----------
Initial conversion of 'technical design'-section. Figure still missing.
Modified Paths:
--------------
branches/jss_dec_10/FINAL_JSS_TEX/RKWard_paper.tex
branches/jss_dec_10/FINAL_JSS_TEX/background.tex
Added Paths:
-----------
branches/jss_dec_10/FINAL_JSS_TEX/technical.tex
Modified: branches/jss_dec_10/FINAL_JSS_TEX/RKWard_paper.tex
===================================================================
--- branches/jss_dec_10/FINAL_JSS_TEX/RKWard_paper.tex 2010-12-16 13:30:44 UTC (rev 3252)
+++ branches/jss_dec_10/FINAL_JSS_TEX/RKWard_paper.tex 2010-12-16 16:07:50 UTC (rev 3253)
@@ -98,7 +98,7 @@
%% work in parallel, easier
\include{background}
%%\include{usage}
-%%\include{technical}
+\include{technical}
%%\include{example_session}
%%\include{example_plugin}
Modified: branches/jss_dec_10/FINAL_JSS_TEX/background.tex
===================================================================
--- branches/jss_dec_10/FINAL_JSS_TEX/background.tex 2010-12-16 13:30:44 UTC (rev 3252)
+++ branches/jss_dec_10/FINAL_JSS_TEX/background.tex 2010-12-16 16:07:50 UTC (rev 3253)
@@ -65,7 +65,7 @@
GFDL (GNU Free Documentation License) licensed. While the project remains in constant development, a growing
number of users employs RKWard in productive scenarios. The source code,
selected binaries and documentation is hosted at SourceForge
-(http://sourceforge.net/). Some key milestones of the development of RKWard are
+(\url{http://sourceforge.net/}). Some key milestones of the development of RKWard are
visualized in Figure~\ref{fig:timeline}.
\begin{figure}[htp]
Added: branches/jss_dec_10/FINAL_JSS_TEX/technical.tex
===================================================================
--- branches/jss_dec_10/FINAL_JSS_TEX/technical.tex (rev 0)
+++ branches/jss_dec_10/FINAL_JSS_TEX/technical.tex 2010-12-16 16:07:50 UTC (rev 3253)
@@ -0,0 +1,356 @@
+\section[technical]{Technical Design}
+In this section we will give a compact overview of key aspects of RKWard's
+technical design. We will give slightly more attention to the details of the
+plugin framework used in RKWard, since this is central to the extensibility of
+RKWard.
+
+\subsection[technical_asynchronous]{Asynchronous command execution}
+One central design decision in the implementation of RKWard is that the
+interface to the \proglang{R} engine operates asynchronously. The intention is to
+keep the application usable to a high degree, even during the computation of
+time-consuming analyses. For instance, while waiting for the estimation of a
+complex model to complete, the user should be able to continue to use the GUI to
+prepare the next analysis. Asynchronous command execution is also a prerequisite
+for an implementation of the plot-preview feature (see Section~\ref{usage_plotpreview}). Commands
+generated from plugins or user actions are placed in a queue and are evaluated in
+a separate thread in the order they were submitted\footnote{
+ It is possible, and in some cases necessary, to enforce a different order of command execution in
+ internal code. For instance, RKWard makes sure that no user command can
+ potentially interfere while RKWard is loading the data of a \code{data.frame} for
+ editing.
+}. The asynchronous design implies that RKWard avoids relying on the
+\proglang{R} engine during interactive use. This is one of several reasons for
+the use of \proglang{ECMAScript} in plugins, instead of scripting using
+\proglang{R} (see Sections~\ref{technical_toolkit} and \ref{technical_plugins}).
+A further implication is that RKWard avoids querying information about the
+existence and properties of objects in \proglang{R} interactively. Rather,
+RKWard keeps a representation of \proglang{R} objects and their basic properties
+(e.g. class and dimensions), which is used for the workspace browser (Section~\ref{usage_browser}),
+object name completion, function argument hinting and
+other occasions. The object representation includes objects in all environments
+on the search path, and any objects contained within these environments in a
+hierarchical tree\footnote{
+ Currently, environments of functions or formulas are not taken into account.
+}. The representation of \proglang{R} objects is gathered
+pro-actively. This has a notable impact on performance when loading packages
+(specifically, objects which would usually be ``lazy loaded'' only when needed \citep[see][]{Ripley2004} are
+accessed in order to fetch information on their properties; this means the data
+has to be loaded from disk; however, the memory is freed directly after fetching
+information on the object).
+
+A further side-effect of the asynchronous threaded design is that there is
+inherently a rather clear separation between GUI code and code making direct use
+of the \proglang{R} API. In the current development version, the evaluation
+of \proglang{R} commands has even been moved into a separate process. In the somewhat longer term it could even
+be possible to run GUI and \proglang{R} engine on different computers.
+
+\subsection[technical_omd]{Object modification detection}
+RKWard allows the user to run arbitrary commands in \proglang{R} at any time, even while
+editing a \code{data.frame} or while selecting objects for analysis in a GUI dialog. Any user
+command could potentially add, modify, or remove objects in \proglang{R}. RKWard tries to
+detect such changes in order to always display accurate information in the
+workspace browser, object selection lists, and object views. Beyond that,
+detecting any changes is particularly important with respect to objects which
+are currently opened for editing in the data editor (which provides an illusion
+of in-place editing, see Section~\ref{usage_dataeditor}). Here, it is necessary to synchronize
+the data between \proglang{R} and the GUI in both directions.
+
+For simplicity and performance, object modification detection is only
+implemented for objects inside the ``global environment'' (including in environments
+inside the global environment), since this is where changes are typically made.
+Currently object modification detection is based on active bindings.
+Essentially, any object which is created in the global environment is first
+moved to a hidden storage environment, and then replaced with an active binding.
+The active binding acts as a transparent proxy to the object in the storage
+environment, which registers any write-access to the object\footnote{
+ This is similar to the approach taken in the \pkg{trackObjs} package \citep{Plate2009}.
+}.
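+
+The proxy mechanism can be illustrated in plain \proglang{R}. The following is a
+minimal sketch only, not RKWard's actual implementation; the names
+\code{storage}, \code{modified}, and \code{watch()} are hypothetical:
+
+\begin{Code}
+# hypothetical hidden storage environment and record of modified objects
+storage <- new.env()
+modified <- character(0)
+
+watch <- function(name, env = globalenv()) {
+  # move the current value into the storage environment
+  assign(name, get(name, envir = env), envir = storage)
+  rm(list = name, envir = env)
+  # replace it with an active binding which acts as a proxy
+  makeActiveBinding(name, function(value) {
+    if (!missing(value)) {
+      # write access: update the stored copy and register the change
+      assign(name, value, envir = storage)
+      modified <<- union(modified, name)
+    }
+    get(name, envir = storage)
+  }, env)
+}
+
+x <- 1
+watch("x")
+x <- 2    # write access is registered in 'modified'
+x         # read access is passed through to the stored copy
+\end{Code}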
+
+The use of active bindings has significant performance implications when
+objects are accessed very frequently. This is particularly noticeable where an
+object inside the global environment (i.e. an object wrapped into an active
+binding) is used as the index variable in a loop, as illustrated by the
+following example:
+
+\begin{Code}
+# 'i', created below, will become subject to object modification detection
+# as soon as the user command returns
+i <- 1
+
+# this loop will run slowly, since 'i' is stored as an active binding
+for (i in 1:100000) i + i
+
+f <- function () {
+ # this loop will run approximately as fast as in plain R
+ # 'i' is a local object in this function, and not subject
+ # to object modification detection
+ for (i in 1:100000) i + i
+}
+f ()
+\end{Code}
+
+It may be possible to overcome this performance problem in future versions of
+RKWard. One approach that is currently under consideration is to simply perform
+a pointer comparison of the SEXP records of objects in the global environment with
+their copies in the hidden storage environment. Due to the implicit sharing of
+SEXP records \citep{RDCT2010a, RDCT2010b}, this should provide for a reliable
+way to detect changes for most types of \proglang{R} objects, with comparatively low memory
+and performance overhead. Special handling will be needed for environments and
+active bindings.
+
+\subsection[technical_toolkit]{Choice of toolkit and implementation languages}
+In addition to \proglang{R}, RKWard is based on the \proglang{KDE} libraries, which are in turn based
+on \proglang{Qt}, and implemented mostly in \proglang{C++}. Compared to many competing libraries,
+this constitutes a rather heavy dependency. Moreover, the \proglang{KDE} libraries are
+still known to have portability issues, especially on Mac OS, and to some degree
+also on the Windows platform.
+
+The major reason for the choice of the \proglang{KDE} and \proglang{Qt} libraries is that they provide
+many high level features which have allowed RKWard development to make quick
+progress despite limited resources. Most importantly, the \proglang{KDE} libraries provide a
+full featured text editor \citep{CullmannND} as a component which can be
+seamlessly integrated into a hosting application using the KParts technology
+\citep{Faure2000}. Additionally, a KPart provides HTML browsing capabilities in a
+similarly integrated way. The availability of KWord \citep{KWord} as an
+embeddable KPart might prove useful in future versions of RKWard, when better
+integration with office suites will be sought.
+
+%% NOTE: It's ``XMLGUI'' in one word, even though it's XML and GUI
+Another technology from the \proglang{KDE} libraries that is important to the development
+of RKWard is the ``XMLGUI''-technology
+\citep{Faure2000}. This is especially helpful in providing an integrated GUI for
+the various components of RKWard.
+
+Plugins in RKWard rely on \proglang{XML} (Extensible Markup Language)\footnote{\url{http://www.w3.org/XML/}}
+and \proglang{ECMAScript}\footnote{\url{http://www.ecmascript.org/}} (see Section~\ref{technical_plugins}). \proglang{XML} is not
+only well suited to describe the layout of the GUI of plugins, but simple
+functional logic can also be represented \citep{Visne2009}. \proglang{ECMAScript} was
+chosen for the generation of \proglang{R} commands within plugins in particular due to its
+availability as an embedded scripting engine inside the \proglang{Qt} libraries. While at
+first glance, \proglang{R} itself would appear as a natural choice of scripting language as
+well, this would make it impossible to use plugins in an asynchronous way.
+Further, the main functional requirement at this place is the manipulation and
+concatenation of text strings. While \proglang{R} provides support for this, concatenating
+strings with the \code{+}-operator, as available in \proglang{ECMAScript}, allows for a much
+more readable way to perform such text concatenation.
+
+\subsection[technical_graphics]{Onscreen graphics windows}
+Contrary to the approach used in \pkg{JGR} \citep{HelbigTheus2005}, RKWard does
+not technically provide a custom on-screen graphics device. RKWard detects when
+new graphics windows are created via calls to \code{X11()} or \code{windows()}. These windows
+are then ``captured'' in a platform dependent way (based on the XEmbed\footnote{\citep{Ettrich2002}} protocol
+for X11, on reparenting for the Windows platform). An RKWard menu bar and a
+toolbar are then added to these windows to provide added functionality. While
+this approach requires some platform dependent code, any corrections or
+improvements made to the underlying \proglang{R} native devices will automatically be
+available in RKWard.
+
+A recent addition to the on-screen device is the ``plot history'' feature which
+adds a browsable list of plots to the device window. Since RKWard does not use a
+custom on-screen graphics device, this feature is implemented in a package
+dependent way. For example, as of this writing, plotting calls that use either
+the ``standard graphics system'' or the ``\pkg{lattice} system'' can be added to the plot
+history; other plots are drawn but not added. The basic procedure is to identify
+changes to the on-screen canvas and record the existing plot before a new plot
+wipes it out. A single ``global'' history for the recorded plots is maintained
+which is used by all the on-screen device windows. This is similar to the
+implementation in Rgui.exe (Windows platform) but unlike the one in Rgui.app
+(Mac OS X platform). Each such device window points to a position in the history
+and behaves independently when recording a new plot or deleting an existing
+plot.
+
+Support for the \pkg{lattice} system is implemented by inserting a hook in the \code{print.lattice()}
+function. This hook retrieves and stores the \code{lattice.status} object from the
+\code{lattice:::.LatticeEnv} environment, thereby making \code{update()} calls on trellis
+objects transparent to the user. Any recorded trellis object is then replayed
+using \code{plot.lattice()}, bypassing the recording mechanism. The standard graphics
+system, on the other hand, is handled differently because the hook in
+\code{plot.new()} is ineffective for this purpose. Instead, \code{plot.new()} is overloaded
+with a customized function which stores the existing plot, essentially using
+\code{recordPlot()}, and replays recorded plots using \code{replayPlot()}.
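+
+The store-and-replay idea for the standard graphics system can be sketched as
+follows. This is an illustration under simplifying assumptions, not RKWard's
+code: \code{plotHistory}, \code{recordCurrent()}, and \code{showRecorded()} are
+hypothetical names, and an interactive on-screen device is assumed.
+
+\begin{Code}
+plotHistory <- list()    # hypothetical global plot history
+
+recordCurrent <- function() {
+  # append a snapshot of the plot on the active device
+  plotHistory[[length(plotHistory) + 1]] <<- recordPlot()
+}
+
+showRecorded <- function(i) {
+  # redraw a recorded plot on the active device
+  replayPlot(plotHistory[[i]])
+}
+
+plot(1:10)
+recordCurrent()      # in RKWard, recording happens before a new plot is drawn
+hist(rnorm(100))     # wipes out the first plot
+showRecorded(1)      # brings the first plot back
+\end{Code}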
+
+The actual plotting calls are tracked using appropriate \code{sys.call()} commands in
+the hooks. These call strings are displayed as a drop-down menu on the toolbar
+for non-sequential browsing (see Figure~\ref{fig:plot_history}) providing a very intuitive browsing
+interface unlike the implementation for windows or quartz devices.
+
+\subsection[technical_plugins]{Plugin infrastructure}
+One of the earliest features of RKWard was the extensibility by plugins.
+Basically, plugins in RKWard provide complete GUI-dialogs, or reusable
+GUI-components, which accept user settings, and translate those user settings
+into \proglang{R} code\footnote{
+ Plugins are also used in some other contexts within RKWard, for instance the
+ Kate part supports extensions via plugins and user scripts. At this point we
+ will focus only on plugins generating \proglang{R} code.
+}. Thus, the plugin framework is basically a tool set used to define
+GUIs for the automatic generation of \proglang{R} code. Much of the functionality in RKWard
+is currently implemented as plugins. For example, import of different file
+formats relying on the \pkg{foreign} package is achieved by this approach. Similarly,
+RKWard provides a modest GUI driven tool set for statistical analysis,
+especially for item response theory (IRT), distributions and descriptive
+statistical analysis.
+
+\subsubsection[technical_plugins_defining]{Defining a plugin}
+Plugins consist of four parts \citep[see Section~\ref{example_plugin} for an example; for a complete
+manual, see][]{Friedrichsmeier2010}:
+
+%% TODO: Make these bullets!
+\begin{itemize}
+ \item
+ An XML file, called a ``plugin map,'' is used to declare one or more plugins, each
+ with a unique identifier. For most plugins, the plugin map also defines the
+ placement in the menu hierarchy. Plugin maps are meant to represent groups of
+ plugins. Users can disable/enable such groups of plugins in order to reduce the
+ complexity of the menu hierarchy.
+
+ \item
+ A second XML file describes the plugin itself. Most importantly this includes
+ the definition of the GUI-layout and GUI-behavior. High level GUI-elements can
+ be defined with simple XML-tags. Layout is based on ``rows'' and ``columns'',
+ instead of pixel-counts. In most cases this allows for a sensible resizing
+ behavior. RKWard supports single-page dialogs and multi-page wizards; however,
+ most plugins define only a single-page UI. GUI behavior can be programmed by
+ connecting ``properties'' of the GUI elements to each other. For example, the state
+ of a checkbox could be connected to the ``enabled'' property of a dependent
+ control. More complex logic is also supported. Procedural scripting of GUI
+ behavior using \proglang{ECMAScript} is also supported.
+
+ \item
+ A separate \proglang{ECMAScript}-file is used to translate GUI settings into \proglang{R}
+ code\footnote{
+ In earlier versions of RKWard, \proglang{PHP} (PHP: Hypertext Preprocessor) was used
+ as a scripting engine, and \proglang{PHP}-interpreters were run in a separate process.
+ Usage of \proglang{PHP} was abandoned in RKWard version 0.5.3.
+ }. This \proglang{ECMAScript} file is evaluated asynchronously in a separate thread. RKWard
+ currently enforces structuring the code into three separate sections for
+ preprocessing, calculating, and printing results. The generated code is always
+ run in a local environment, in order to allow the use of temporary variables
+ without the danger of overwriting user data.
+
+ \item
+ A third \proglang{XML} file defines a help page. This help page usually links to the \proglang{R} help
+ pages of the central functions/concepts used by the plugin. Compared to \proglang{R} help
+ pages, the plugin help pages try to give more hands-on advice on using the
+ plugin. Plugins can be invoked from their help page by clicking on a link near
+ the top, which can be useful after following a link from a related help page.
+\end{itemize}
+
+The source code of all of these elements can be changed without the need to recompile RKWard.
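+
+To illustrate the three-part structure and the use of a local environment, a
+plugin might generate \proglang{R} code roughly like the following. This is a
+hand-written sketch with made-up data, not actual plugin output; \code{print()}
+stands in for the RKWard output functions so that the example runs in plain
+\proglang{R}.
+
+\begin{Code}
+local({
+  ## preprocessing: e.g. load required packages
+  require(stats)
+  ## calculating: temporary variables stay inside local()
+  x <- c(5.1, 4.9, 5.6, 5.8, 6.0)
+  y <- c(4.1, 4.4, 5.0, 4.7, 5.2)
+  result <- t.test(x, y)
+  ## printing results
+  print(result)
+})
+# 'result', 'x', and 'y' were not created in the user's workspace
+\end{Code}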
+
+\subsubsection[technical_plugins_embedding]{Embedding and reuse of plugins}
+RKWard supports several mechanisms for modularization and re-use of
+functionality in plugins. File inclusion is one very simple but effective
+mechanism, which can be used in the \proglang{ECMAScript} files but is also supported in
+the \proglang{XML}-files. In script files, this is most useful for defining common functions
+in an included file. For the \proglang{XML}-files, the equivalent is to define ``snippets''
+in the included file, which can then be inserted.
+
+A third mechanism allows one plugin to be embedded completely into another. For
+instance the \code{plot\_options} plugin is used by many plugins in RKWard to provide
+common plot options such as plot labels, axis options, and grids. Other plugins
+can embed this using the \code{embed}-tag in their \proglang{XML} file (the plugin supports
+hiding irrelevant options). The generated code portions can be fetched from the
+\proglang{ECMAScript} file just like any other GUI settings, and inserted into the complete
+code. Other examples of embedded plugins are options for histograms, barplots,
+and ECDF plots (which in turn embed the generic plot options plugin).
+
+\subsubsection[technical_plugins_consistency]{Enforcing a consistent interface}
+RKWard tries to make it easy to create a consistent interface in all plugins.
+GUI-wise this is supported by providing high-level GUI elements, and embeddable
+clients. Also, the standard elements of each dialog (``Submit'' and
+``Cancel'' buttons, on-the-fly code view, etc.) are hard-coded. Up to version
+0.5.3 of RKWard it was not possible to use any GUI elements in plugins which
+were not explicitly defined for this purpose. In the current development
+version, theoretically, all GUI elements available from \proglang{Qt} can be inserted,
+where necessary.
+
+For generating output, the function \code{rk.header()} can be used to print a
+standardized caption for each piece of output. Printing results in vector or
+tabular form is facilitated by \code{rk.results()}. A wide range of objects can be
+printed using \code{rk.print()}, which is just a thin wrapper around the
+\code{HTML()}-function of the \pkg{R2HTML}-package \citep{Lecoutre2003} in the current
+implementation. The use of custom formatting with \proglang{HTML} is possible, but
+discouraged. Standard elements such as a horizontal separator, and the run-again
+link (see Section~\ref{usage_output}) are inserted automatically, without the need to define
+them for each plugin.
+
+Regarding the style of the generated \proglang{R} code, enforcing consistency is harder,
+but plugins which are to become part of the official RKWard distribution are
+reviewed for adherence to some guidelines. Perhaps the most important guidelines
+are
+
+\begin{itemize}
+ \item
+ Write readable code, which is properly indented, and commented where necessary.
+
+ \item
+ Do not hide any relevant computations from the user by performing them in the
+ \proglang{ECMAScript}. Rather, generate \proglang{R} code which will perform
+ those computations, transparently.
+\end{itemize}
+
+\subsubsection[technical_plugins_dependencies]{Handling of \proglang{R} package dependencies}
+A wide range of plugins for diverse functionality is present in RKWard,
+including plots (e.g. boxplot) or standard tests (e.g. Student's $t$-test)\footnote{
+ At the time of this writing, there are 164 user-accessible plugins in RKWard.
+ Listing all is beyond the scope of this article.
+}. Some
+of the plugins depend on \proglang{R} packages other than the recommended \proglang{R} base packages.
+Examples include the calculation of kurtosis, skewness or the exact Wilcoxon
+test. Installation of additional packages is handled automatically by RKWard
+(see Section~\ref{usage_packages}).
+
+Unlike \pkg{Rcmdr}, RKWard avoids loading all these packages pro-actively. Rather,
+plugins which depend on a certain package simply include an appropriate call to
+\code{require()} in the pre-processing section of the generated \proglang{R} code. The \code{require()}
+function is overloaded in RKWard, in order to bring up the package-installation
+dialog whenever needed. Packages loaded by \code{require()} remain loaded until
+RKWard is terminated or the package is manually unloaded (\code{detach()}).
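+
+The idea of overloading \code{require()} can be sketched as below. This is a
+hypothetical stand-in, not RKWard's implementation: where RKWard would bring
+up its package-installation dialog, the sketch merely prints a message.
+
+\begin{Code}
+# mask base::require() with a wrapper (illustrative only)
+require <- function(package, ..., character.only = FALSE) {
+  if (!character.only) package <- as.character(substitute(package))
+  ok <- suppressWarnings(
+    base::require(package, ..., character.only = TRUE)
+  )
+  if (!ok) {
+    # RKWard would offer to install the missing package at this point
+    message("Package '", package, "' is not installed")
+  }
+  invisible(ok)
+}
+
+require(stats)    # available: loaded as usual
+\end{Code}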
+
+Dependencies between (embedded) plugins are handled using the \code{<require>}-tag in the plugin map.
+
+\subsection[technical_processes]{Development process}
+\subsubsection[technical_processes_plugins]{RKWard core and external plugins}
+Newly developed plugins are placed in a dedicated plugin map called
+\code{under\_development.pluginmap}. Plugins in this map are not visible to the user by
+default, but need to be enabled manually. Once the author(s) of a plugin
+announce that they consider it stable, the plugin is subjected to a review for
+correctness, style, and usability. The review status is tracked in the project
+wiki. Currently at least one positive review is needed before the plugin is
+allowed to be made visible by default, by moving it to an appropriate plugin
+map.
+
+The current development version adds support for downloading additional sets of
+plugins from the Internet, which are not officially included or supported by the
+RKWard developers.
+
+\subsubsection[technical_processes_automatedtesting]{Automated testing}
+A second requirement for new plugins is that each plugin must be accompanied by
+at least one automated test. The automated testing framework in RKWard consists
+of a set of \proglang{R} scripts which allow running a plugin with specific GUI settings,
+automatically\footnote{
+ In the current development version, the scripts have been converted into a proper
+ \proglang{R} package.
+}. The resulting \proglang{R} code, \proglang{R} messages, and output are then compared
+to a defined standard. Automated tests are run routinely after changes in the
+plugin infrastructure, and before any new release.
+
+The automated testing framework is also useful in testing some aspects of the
+application which are not implemented as plugins, but this is currently limited
+to very few basic tests.
+
+\subsection[technical_internationalization]{Internationalization}
+Currently strings in the main application are translated to varying extents in
+Czech (cs), Catalan (ca), Spanish (es), German (de), Chinese (zh\_CN), Turkish
+(tr), Polish (pl), Italian (it), French (fr), Greek (el), and Danish (da).
+Translatable strings are to be found under po/**.po in the sources. These files
+can be conveniently edited with front-ends like Lokalize
+(\url{http://i18n.kde.org/tools/}).
+
+Plugins and help pages in RKWard are not translatable at the time of this
+writing. While it would be technically possible to include the respective strings in
+message catalogs, this is not currently implemented in RKWard. Similarly, any
+output generated by \proglang{R} functions defined for RKWard is not currently
+translatable. Again, however, there is no technical barrier with respect to
+the internationalization of \proglang{R} code, as discussed by \cite{Ripley2005a},
+and it is planned to make RKWard fully translatable in future versions.