[rkward-cvs] SF.net SVN: rkward:[4408] trunk/rkward/packages/XiMpLe/inst

m-eik at users.sourceforge.net
Sat Nov 3 01:16:26 UTC 2012


Revision: 4408
          http://rkward.svn.sourceforge.net/rkward/?rev=4408&view=rev
Author:   m-eik
Date:     2012-11-03 01:16:26 +0000 (Sat, 03 Nov 2012)
Log Message:
-----------
XiMpLe: starting a vignette

Added Paths:
-----------
    trunk/rkward/packages/XiMpLe/inst/doc/
    trunk/rkward/packages/XiMpLe/inst/doc/XiMpLe_vignette.Rnw

Added: trunk/rkward/packages/XiMpLe/inst/doc/XiMpLe_vignette.Rnw
===================================================================
--- trunk/rkward/packages/XiMpLe/inst/doc/XiMpLe_vignette.Rnw	                        (rev 0)
+++ trunk/rkward/packages/XiMpLe/inst/doc/XiMpLe_vignette.Rnw	2012-11-03 01:16:26 UTC (rev 4408)
@@ -0,0 +1,167 @@
+\documentclass[a4paper,10pt]{scrartcl}
+\usepackage[utf8x]{inputenc}
+\usepackage{apacite}
+
+\newcommand{\X}[0]{\texttt{XiMpLe}}
+
+%opening
+\title{The XiMpLe Package}
+%\VignetteIndexEntry{The XiMpLe Package}
+\author{m.eik michalke}
+
+\begin{document}
+\maketitle
+
+\begin{abstract}
+This package provides basic tools for parsing XML into R objects and for generating XML from R. It is not as feature-rich as alternative packages, but it's small and keeps dependencies to a minimum.
+\end{abstract}
+
+\section{Previously on \X{}}
+
+Before I even begin, I would like to stress that \X{} can \textit{not} replace the \texttt{XML} package, and it is not supposed to. It has only a handful of functions, so it can only do so much. Probably the most noteworthy missing feature in this package is any real DTD support. If you need that, you can stop reading here. Another problem is speed -- \X{} is written in pure R, and it's painfully slow with large XML trees. You won't notice this if you're only dealing with documents of a few kilobytes, but if you need to parse really huge documents, it can take ages to finish.
+
+Historically, this package was written for exactly one purpose: I wanted to be able to read and write the XML documents of \texttt{RKWard}\footnote{\url{http://rkward.sourceforge.net}}, because I was about to write an R package for scripting plugins for this R GUI. I had actually started another project shortly before, using the \texttt{XML} package as a dependency, but soon got complaints from Windows users. As it turned out, that package was not available for Windows, because somehow it couldn't be built automatically. I realised that I only needed a small subset of its features anyway, so I figured the easiest way might be to quickly implement those features myself. Instead of hiding them in the internals of what eventually became the \texttt{rkwarddev} package, I then started working on this package first. And well, ``quickly'' was rather optimistic... but since I'm happily using \X{} in other packages as well (like \texttt{roxyPackage}), I'm satisfied it was worth it.
+
+So now you know. If you need a full-featured package to parse or generate XML in R, try the \texttt{XML} package. Otherwise, keep on reading.
+
+\section{And now the continuation}
+
+Basically, \X{} can do these things for you:
+
+\begin{itemize}
+  \item parse XML from files into an R object, using the \texttt{parseXML()} function
+  \item generate XML R objects, using the functions \texttt{XMLNode()} and \texttt{XMLTree()}
+  \item extract nodes from XML R objects, or change their content, using the \texttt{node()} function
+  \item write back XML files from R objects, using the \texttt{pasteXML()} function
+\end{itemize}
+
+That about covers it. XML nodes can of course be nested to construct complex trees, but that's all. Let's look at some examples.
+
+\section{Naming conventions}
+Let's quickly explain what we'll be talking about here. If you're parsing an XML document, it will contain an \textbf{XML tree}. This tree is made up of \textbf{XML nodes}. A node is delimited by angle brackets, \textit{must} have a \textbf{name}, \textit{can} have \textbf{attributes}, and is either \textbf{empty} or not. Nodes can be nested, where nodes inside a node are its \textbf{child nodes}.
+
+\begin{Schunk}
+	\begin{Sinput}
+<!-- following is an empty node named "useless" -->
+<useless />
+
+<!-- the next node is non-empty and has an attribute foo with value bar -->
+<other foo="bar">
+  this text is the child of the "other" node.
+</other>
+	\end{Sinput}
+\end{Schunk}
+
+\section{Generate XML trees}
+
+Now let's see how these nodes can be generated using the \X{} package. Single nodes are the domain of the \texttt{XMLNode()} function, and to get an empty node you just give it the name of that node:
+
+\begin{Schunk}
+	\begin{Sinput}
+> XMLNode("useless")
+	\end{Sinput}
+	\begin{Soutput}
+<useless />
+	\end{Soutput}
+\end{Schunk}
+
+As you can see, XML code is printed to the console. But what this function returns is actually an R object of class \texttt{XiMpLe.node}, so what you see is an interpretation of that object, done by the \texttt{show()} method for objects of this type. If you would like to write the XML code to a file, you need to call \texttt{pasteXML()}, which will return a character string:
+
+\begin{Schunk}
+	\begin{Sinput}
+> useless.node <- XMLNode("useless")
+> pasteXML(useless.node)
+	\end{Sinput}
+	\begin{Soutput}
+[1] "<useless />\n"
+	\end{Soutput}
+\end{Schunk}
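+
+To actually get that string into a file, it can be handed on to base R's \texttt{cat()}. The following is only a minimal sketch; the file name \texttt{useless.xml} is an arbitrary example, and the working directory is assumed to be writable:
+
+\begin{Schunk}
+	\begin{Sinput}
+> # "useless.xml" is a made-up file name, used for illustration only
+> cat(pasteXML(useless.node), file="useless.xml")
+	\end{Sinput}
+\end{Schunk}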
+
+The second node in the example above has an attribute. Attributes can be specified with the \texttt{attrs} argument, which expects a named list:
+
+\begin{Schunk}
+	\begin{Sinput}
+> XMLNode("other", attrs=list(foo="bar"))
+	\end{Sinput}
+	\begin{Soutput}
+<other foo="bar" />
+	\end{Soutput}
+\end{Schunk}
+
+So, by default, as long as our node doesn't have any children, it's assumed to be an empty node. To force it into a non-empty node (i.\,e., opening and closing tag) even without content, we'd have to provide an empty character string as its child. Child nodes can be provided in two ways -- either one by one via the ``$\dots$'' argument, or as one list via the \texttt{.children} argument:
+
+\begin{Schunk}
+	\begin{Sinput}
+> XMLNode("other", "", attrs=list(foo="bar"))
+	\end{Sinput}
+	\begin{Soutput}
+<other foo="bar">
+</other>
+	\end{Soutput}
+	\begin{Sinput}
+> XMLNode("other", attrs=list(foo="bar"), .children=list(""))
+	\end{Sinput}
+	\begin{Soutput}
+<other foo="bar">
+</other>
+	\end{Soutput}
+\end{Schunk}
+
+Of course this is also the place to provide our node with the text value:
+
+\begin{Schunk}
+	\begin{Sinput}
+> XMLNode("other", "this text is the child of the \"other\" node.",
++ attrs=list(foo="bar"))
+	\end{Sinput}
+	\begin{Soutput}
+<other foo="bar">
+  this text is the child of the "other" node.
+</other>
+	\end{Soutput}
+\end{Schunk}
+
+But how about the comments? Well, \X{} does detect some special node names, one being ``\texttt{!--}'' to indicate a comment:
+
+\begin{Schunk}
+	\begin{Sinput}
+> XMLNode("!--", "following is an empty node named \"useless\"")
+	\end{Sinput}
+	\begin{Soutput}
+<!-- following is an empty node named "useless" -->
+	\end{Soutput}
+\end{Schunk}
+
+OK, that's single nodes. In most cases, you want to have nested nodes which combine into an XML tree. As a practical example, this is how you could generate an XHTML document:
+
+\begin{Schunk}
+	\begin{Sinput}
+> sample.XML.a <- XMLNode("a", "click here!",
++ attrs=list(href="http://example.com", target="_blank"))
+> sample.XML.body <- XMLNode("body", sample.XML.a)
+> sample.XML.html <- XMLNode("html", XMLNode("head", ""),
++ sample.XML.body)
+> sample.XML.tree <- XMLTree(sample.XML.html,
++ xml=list(version="1.0", encoding="UTF-8"),
++ dtd=list(doctype="html PUBLIC",
++ id="-//W3C//DTD XHTML 1.0 Transitional//EN",
++ refer="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"))
+> sample.XML.tree
+	\end{Sinput}
+	\begin{Soutput}
+<?xml version="1.0" encoding="UTF-8" ?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html>
+  <head>
+  </head>
+  <body>
+    <a href="http://example.com" target="_blank">
+      click here!
+    </a>
+  </body>
+</html>
+	\end{Soutput}
+\end{Schunk}
+
+It should be noted, however, that \X{} doesn't perform even the slightest checks on what you provide as \texttt{DOCTYPE} or \texttt{xml} attributes.
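+
+The two functions from the list above that have not been shown yet, \texttt{parseXML()} and \texttt{node()}, can be sketched as a round trip through a file. This is only a rough illustration, not a definitive recipe: the temporary file is an arbitrary choice, and it is assumed here that the \texttt{node} argument of \texttt{node()} takes a list of node names describing the path to the node in question.
+
+\begin{Schunk}
+	\begin{Sinput}
+> # write the <html> node from above to a temporary file
+> xml.file <- tempfile(fileext=".xml")
+> cat(pasteXML(sample.XML.html), file=xml.file)
+> # read the file back into an R object
+> parsed.tree <- parseXML(xml.file)
+> # assumption: the path to the <a> node is given as a list of node names
+> node(parsed.tree, node=list("html", "body", "a"))
+	\end{Sinput}
+\end{Schunk}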
+
+\end{document}
