<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Extending SWIG to support new languages</title>
<link rel="stylesheet" type="text/css" href="style.css">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body bgcolor="#ffffff">
<H1><a name="Extending">39 Extending SWIG to support new languages</a></H1>
<!-- INDEX -->
<div class="sectiontoc">
<ul>
<li><a href="#Extending_nn2">Introduction</a>
<li><a href="#Extending_nn3">Prerequisites</a>
<li><a href="#Extending_nn4">The Big Picture</a>
<li><a href="#Extending_nn5">Execution Model</a>
<ul>
<li><a href="#Extending_nn6">Preprocessing</a>
<li><a href="#Extending_nn7">Parsing</a>
<li><a href="#Extending_nn8">Parse Trees</a>
<li><a href="#Extending_nn9">Attribute namespaces</a>
<li><a href="#Extending_nn10">Symbol Tables</a>
<li><a href="#Extending_nn11">The %feature directive</a>
<li><a href="#Extending_nn12">Code Generation</a>
<li><a href="#Extending_nn13">SWIG and XML</a>
</ul>
<li><a href="#Extending_nn14">Primitive Data Structures</a>
<ul>
<li><a href="#Extending_nn15">Strings</a>
<li><a href="#Extending_nn16">Hashes</a>
<li><a href="#Extending_nn17">Lists</a>
<li><a href="#Extending_nn18">Common operations</a>
<li><a href="#Extending_nn19">Iterating over Lists and Hashes</a>
<li><a href="#Extending_nn20">I/O</a>
</ul>
<li><a href="#Extending_nn21">Navigating and manipulating parse trees</a>
<li><a href="#Extending_nn22">Working with attributes</a>
<li><a href="#Extending_nn23">Type system</a>
<ul>
<li><a href="#Extending_nn24">String encoding of types</a>
<li><a href="#Extending_nn25">Type construction</a>
<li><a href="#Extending_nn26">Type tests</a>
<li><a href="#Extending_nn27">Typedef and inheritance</a>
<li><a href="#Extending_nn28">Lvalues</a>
<li><a href="#Extending_nn29">Output functions</a>
</ul>
<li><a href="#Extending_nn30">Parameters</a>
<li><a href="#Extending_nn31">Writing a Language Module</a>
<ul>
<li><a href="#Extending_nn32">Execution model</a>
<li><a href="#Extending_starting_out">Starting out</a>
<li><a href="#Extending_nn34">Command line options</a>
<li><a href="#Extending_nn35">Configuration and preprocessing</a>
<li><a href="#Extending_nn36">Entry point to code generation</a>
<li><a href="#Extending_nn37">Module I/O and wrapper skeleton</a>
<li><a href="#Extending_nn38">Low-level code generators</a>
<li><a href="#Extending_configuration_files">Configuration files</a>
<li><a href="#Extending_nn40">Runtime support</a>
<li><a href="#Extending_nn41">Standard library files</a>
<li><a href="#Extending_nn42">User examples</a>
<li><a href="#Extending_test_suite">Test driven development and the test-suite</a>
<ul>
<li><a href="#Extending_running_test_suite">Running the test-suite</a>
</ul>
<li><a href="#Extending_nn43">Documentation</a>
<li><a href="#Extending_coding_style_guidelines">Coding style guidelines</a>
<li><a href="#Extending_language_status">Target language status</a>
<ul>
<li><a href="#Extending_supported_status">Supported status</a>
<li><a href="#Extending_experimental_status">Experimental status</a>
</ul>
<li><a href="#Extending_prerequisites">Prerequisites for adding a new language module to the SWIG distribution</a>
</ul>
<li><a href="#Extending_debugging_options">Debugging Options</a>
<li><a href="#Extending_nn46">Guide to parse tree nodes</a>
<li><a href="#Extending_further_info">Further Development Information</a>
</ul>
</div>
<!-- INDEX -->
<H2><a name="Extending_nn2">39.1 Introduction</a></H2>
<p>
This chapter describes SWIG's internal organization and the process by which
new target languages can be developed. First, a brief word of warning: SWIG
is continually evolving.
The information in this chapter is mostly up to
date, but changes are ongoing. Expect a few inconsistencies.
</p>
<p>
Also, this chapter is not meant to be a hand-holding tutorial. As a starting point,
you should probably look at one of SWIG's existing modules.
</p>
<H2><a name="Extending_nn3">39.2 Prerequisites</a></H2>
<p>
In order to extend SWIG, it is useful to have the following background:
</p>
<ul>
<li>An understanding of the C API for the target language.
<li>A good grasp of the C++ type system.
<li>An understanding of typemaps and some of SWIG's advanced features.
<li>Some familiarity with writing C++ (language modules are currently written in C++).
</ul>
<p>
Since SWIG is essentially a specialized C++ compiler, it may be useful
to have some prior experience with compiler design (perhaps even a
compilers course) to better understand certain parts of the system. A
number of books will also be useful. For example, "The C Programming
Language" by Kernighan and Ritchie (a.k.a. "K&amp;R") and the C++ standard,
"ISO/IEC 14882 Programming Languages - C++" will be of great use.
</p>
<p>
Also, it is useful to keep in mind that SWIG primarily operates as an
extension of the C++ <em>type</em> system. At first glance, this might not be
obvious, but almost all SWIG directives as well as the low-level generation of
wrapper code are driven by C++ datatypes.
</p>
<H2><a name="Extending_nn4">39.3 The Big Picture</a></H2>
<p>
SWIG is a special purpose compiler that parses C++ declarations to
generate wrapper code. To make this conversion possible, SWIG makes
three fundamental extensions to the C++ language:
</p>
<ul>
<li><b>Typemaps</b>. Typemaps are used to define the
conversion/marshalling behavior of specific C++ datatypes. All type conversion in SWIG is
based on typemaps. Furthermore, the association of typemaps to datatypes utilizes an advanced pattern matching
mechanism that is fully integrated with the C++ type system.
</li>
<li><b>Declaration Annotation</b>. To customize wrapper code
generation, most declarations can be annotated with special features.
For example, you can make a variable read-only, you can ignore a
declaration, you can rename a member function, you can add exception
handling, and so forth. Virtually all of these customizations are built on top of a low-level
declaration annotator that can attach arbitrary attributes to any declaration.
Code generation modules can look for these attributes to guide the wrapping process.
</li>
<li><b>Class extension</b>. SWIG allows classes and structures to be extended with new
methods and attributes (the <tt>%extend</tt> directive). This has the effect of altering
the API in the target language and can be used to generate OO interfaces to C libraries.
</ul>
<p>
It is important to emphasize that virtually all SWIG features reduce to one of these three
fundamental concepts. The type system and pattern matching rules also play a critical
role in making the system work. For example, both typemaps and declaration annotation are
based on pattern matching and interact heavily with the underlying type system.
</p>
<H2><a name="Extending_nn5">39.4 Execution Model</a></H2>
<p>
When you run SWIG on an interface, processing is handled in stages by a series of system components:
</p>
<ul>
<li>An integrated C preprocessor reads a collection of configuration
files and the specified interface file into memory. The preprocessor
performs the usual functions including macro expansion and file
inclusion. However, the preprocessor also performs some transformations of the
interface. For instance, <tt>#define</tt> statements are sometimes transformed into
<tt>%constant</tt> declarations. In addition, information related to file/line number
tracking is inserted.
</li>
<li>A C/C++ parser reads the preprocessed input and generates a full
parse tree of all of the SWIG directives and C declarations found.
The parser is responsible for many aspects of the system including
renaming, declaration annotation, and template expansion. However, the parser
does not produce any output nor does it interact with the target
language module as it runs. SWIG is not a one-pass compiler.
</li>
<li>A type-checking pass is made. This adjusts all of the C++ typenames to properly
handle namespaces, typedefs, nested classes, and other issues related to type scoping.
</li>
<li>A semantic pass is made on the parse tree to collect information
related to properties of the C++ interface. For example, this pass
would determine whether or not a class allows a default constructor.
</li>
<li>A code generation pass is made using a specific target language
module. This phase is responsible for generating the actual wrapper
code. All of SWIG's user-defined modules are invoked during this
latter stage of compilation.
</li>
</ul>
<p>
The next few sections briefly describe some of these stages.
</p>
<H3><a name="Extending_nn6">39.4.1 Preprocessing</a></H3>
<p>
The preprocessor plays a critical role in the SWIG implementation. This is because a lot
of SWIG's processing and internal configuration is managed not by code written in C, but
by configuration files in the SWIG library. In fact, when you
run SWIG, parsing starts with a small interface file like this (note: this explains
the cryptic error messages that new users sometimes get when SWIG is misconfigured or installed
incorrectly):
</p>
<div class="code">
<pre>
%include "swig.swg" // Global SWIG configuration
%include "<em>langconfig.swg</em>" // Language specific configuration
%include "yourinterface.i" // Your interface file
</pre>
</div>
<p>
The <tt>swig.swg</tt> file contains global configuration information. In addition, this file
defines many of SWIG's standard directives as macros. For instance, part
of <tt>swig.swg</tt> looks like this:
</p>
<div class="code">
<pre>
...
/* Code insertion directives such as %wrapper %{ ... %} */
#define %begin %insert("begin")
#define %runtime %insert("runtime")
#define %header %insert("header")
#define %wrapper %insert("wrapper")
#define %init %insert("init")
/* Access control directives */
#define %immutable %feature("immutable", "1")
#define %mutable %feature("immutable")
/* Directives for callback functions */
#define %callback(x) %feature("callback") `x`;
#define %nocallback %feature("callback");
/* %ignore directive */
#define %ignore %rename($ignore)
#define %ignorewarn(x) %rename("$ignore:" x)
...
</pre>
</div>
<p>
The fact that most of the standard SWIG directives are macros is
intended to simplify the implementation of the internals. For instance,
rather than having to support dozens of special directives, it is
easier to have a few basic primitives such as <tt>%feature</tt> or
<tt>%insert</tt>.
</p>
<p>
The <em><tt>langconfig.swg</tt></em> file is supplied by the target
language. This file contains language-specific configuration
information. More often than not, this file provides run-time wrapper
support code (e.g., the type-checker) as well as a collection of
typemaps that define the default wrapping behavior. Note: the name of this
file depends on the target language and is usually something like <tt>python.swg</tt>
or <tt>perl5.swg</tt>.
</p>
<p>
As a debugging aid, the text that SWIG feeds to its C++ parser can be
obtained by running <tt>swig -E interface.i</tt>. This output
probably isn't too useful in general, but it will show how macros have
been expanded as well as everything else that goes into the low-level
construction of the wrapper code.
</p>
<H3><a name="Extending_nn7">39.4.2 Parsing</a></H3>
<p>
The current C++ parser handles a subset of C++. Most incompatibilities with C are due to
subtle aspects of how SWIG parses declarations. Specifically, SWIG expects all C/C++ declarations to follow this general form:
</p>
<div class="diagram">
<pre>
<em>storage</em> <em>type</em> <em>declarator</em> <em>initializer</em>;
</pre>
</div>
<p>
<tt><em>storage</em></tt> is a keyword such as <tt>extern</tt>,
<tt>static</tt>, <tt>typedef</tt>, or <tt>virtual</tt>. <tt><em>type</em></tt> is a primitive
datatype such as <tt>int</tt> or <tt>void</tt>. <tt><em>type</em></tt> may be optionally
qualified with a qualifier such as <tt>const</tt> or <tt>volatile</tt>. <tt><em>declarator</em></tt>
is a name with additional type-construction modifiers attached to it (pointers, arrays, references,
functions, etc.). Examples of declarators include <tt>*x</tt>, <tt>**x</tt>, <tt>x[20]</tt>, and
<tt>(*x)(int, double)</tt>. The <tt><em>initializer</em></tt> may be a value assigned using <tt>=</tt> or
a body of code enclosed in braces <tt>{ ... }</tt>.
</p>
<p>
This declaration format covers most common C++ declarations. However, the C++ standard
is somewhat more flexible in the placement of the parts. For example, it is technically legal, although
uncommon, to write something like <tt>int typedef const a</tt> in your program. SWIG simply
doesn't bother to deal with this case.
</p>
<p>
The other significant difference between C++ and SWIG is in the
treatment of typenames. In C++, if you have a declaration like this,
</p>
<div class="code">
<pre>
int blah(Foo *x, Bar *y);
</pre>
</div>
<p>
it won't parse correctly unless <tt>Foo</tt> and <tt>Bar</tt> have
been previously defined as types either using a <tt>class</tt>
definition or a <tt>typedef</tt>. The reasons for this are subtle,
but this treatment of typenames is normally integrated at the level of the C
tokenizer---when a typename appears, a different token is returned to the parser
instead of an identifier.
</p>
<p>
SWIG does not operate in this manner--any legal identifier can be used
as a type name. The reason for this is primarily motivated by the use
of SWIG with partially defined data. Specifically,
SWIG is supposed to be easy to use on interfaces with missing type information.
</p>
<p>
Because of the different treatment of typenames, the most serious
limitation of the SWIG parser is that it can't process type declarations where
an extra (and unnecessary) grouping operator is used. For example:
</p>
<div class="code">
<pre>
int (x); /* A variable x */
int (y)(int); /* A function y */
</pre>
</div>
<p>
The placing of extra parentheses in type declarations like this is
already recognized by the C++ community as a potential source of
strange programming errors. For example, Scott Meyers' "Effective STL"
discusses this problem in a section on avoiding C++'s "most vexing
parse."
</p>
<p>
The parser is also unable to handle declarations with no return type or bare argument names.
For example, in an old C program, you might see things like this:
</p>
<div class="code">
<pre>
foo(a, b) {
...
}
</pre>
</div>
<p>
In this case, the return type as well as the types of the arguments
are taken by the C compiler to be an <tt>int</tt>. However, SWIG
interprets the above code as an abstract declarator for a function
returning a <tt>foo</tt> and taking types <tt>a</tt> and <tt>b</tt> as
arguments.
</p>
<H3><a name="Extending_nn8">39.4.3 Parse Trees</a></H3>
<p>
The SWIG parser produces a complete parse tree of the input file before any wrapper code
is actually generated. Each item in the tree is known as a "Node". Each node is identified
by a symbolic tag. Furthermore, a node may have an arbitrary number of children.
The parse tree structure and tag names of an interface can be displayed using <tt>swig -debug-tags</tt>.
For example:
</p>
<div class="shell">
<pre>
$ <b>swig -c++ -python -debug-tags example.i</b>
. top (example.i:1)
. top . include (example.i:1)
. top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
. top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
. top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
. top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
. top . include (example.i:4)
. top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:7)
. top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:8)
. top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:19)
...
. top . include (example.i:6)
. top . include . module (example.i:2)
. top . include . insert (example.i:6)
. top . include . include (example.i:9)
. top . include . include . class (example.h:3)
. top . include . include . class . access (example.h:4)
. top . include . include . class . constructor (example.h:7)
. top . include . include . class . destructor (example.h:10)
. top . include . include . class . cdecl (example.h:11)
. top . include . include . class . cdecl (example.h:11)
. top . include . include . class . cdecl (example.h:12)
. top . include . include . class . cdecl (example.h:13)
. top . include . include . class . cdecl (example.h:14)
. top . include . include . class . cdecl (example.h:15)
. top . include . include . class (example.h:18)
. top . include . include . class . access (example.h:19)
. top . include . include . class . cdecl (example.h:20)
. top . include . include . class . access (example.h:21)
. top . include . include . class . constructor (example.h:22)
. top . include . include . class . cdecl (example.h:23)
. top . include . include . class . cdecl (example.h:24)
. top . include . include . class (example.h:27)
. top . include . include . class . access (example.h:28)
. top . include . include . class . cdecl (example.h:29)
. top . include . include . class . access (example.h:30)
. top . include . include . class . constructor (example.h:31)
. top . include . include . class . cdecl (example.h:32)
. top . include . include . class . cdecl (example.h:33)
</pre>
</div>
<p>
Even for the simplest interface, the parse tree structure is larger than you might expect. For example, in the
above output, a substantial number of nodes are actually generated by the <tt>python.swg</tt> configuration file
which defines typemaps and other directives. The contents of the user-supplied input file don't appear until the end
of the output.
</p>
<p>
The contents of each parse tree node consist of a collection of attribute/value
pairs. Internally, the nodes are simply represented by hash tables. A display of
the entire parse-tree structure can be obtained using <tt>swig -debug-top &lt;n&gt;</tt>, where <tt>n</tt> is
the stage being processed.
There are a number of other parse tree display options, for example, <tt>swig -debug-module &lt;n&gt;</tt> will
avoid displaying system parse information and only display the parse tree pertaining to the user's module at
stage <tt>n</tt> of processing.
</p>
<div class="shell">
<pre>
$ swig -c++ -python -debug-module 4 example.i
+++ include ----------------------------------------
| name - "example.i"
+++ module ----------------------------------------
| name - "example"
|
+++ insert ----------------------------------------
| code - "\n#include \"example.h\"\n"
|
+++ include ----------------------------------------
| name - "example.h"
+++ class ----------------------------------------
| abstract - "1"
| sym:name - "Shape"
| name - "Shape"
| kind - "class"
| symtab - 0x40194140
| sym:symtab - 0x40191078
+++ access ----------------------------------------
| kind - "public"
|
+++ constructor ----------------------------------------
| sym:name - "Shape"
| name - "Shape"
| decl - "f()."
| code - "{\n nshapes++;\n }"
| sym:symtab - 0x40194140
|
+++ destructor ----------------------------------------
| sym:name - "~Shape"
| name - "~Shape"
| storage - "virtual"
| code - "{\n nshapes--;\n }"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "x"
| name - "x"
| decl - ""
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "y"
| name - "y"
| decl - ""
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "move"
| name - "move"
| decl - "f(double, double)."
| parms - double, double
| type - "void"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "area"
| name - "area"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| value - "0"
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "perimeter"
| name - "perimeter"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| value - "0"
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "nshapes"
| name - "nshapes"
| decl - ""
| storage - "static"
| type - "int"
| sym:symtab - 0x40194140
|
+++ class ----------------------------------------
| sym:name - "Circle"
| name - "Circle"
| kind - "class"
| bases - 0x40194510
| symtab - 0x40194538
| sym:symtab - 0x40191078
+++ access ----------------------------------------
| kind - "private"
|
+++ cdecl ----------------------------------------
| name - "radius"
| decl - ""
| type - "double"
|
+++ access ----------------------------------------
| kind - "public"
|
+++ constructor ----------------------------------------
| sym:name - "Circle"
| name - "Circle"
| parms - double
| decl - "f(double)."
| code - "{ }"
| sym:symtab - 0x40194538
|
+++ cdecl ----------------------------------------
| sym:name - "area"
| name - "area"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194538
|
+++ cdecl ----------------------------------------
| sym:name - "perimeter"
| name - "perimeter"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194538
|
+++ class ----------------------------------------
| sym:name - "Square"
| name - "Square"
| kind - "class"
| bases - 0x40194760
| symtab - 0x40194788
| sym:symtab - 0x40191078
+++ access ----------------------------------------
| kind - "private"
|
+++ cdecl ----------------------------------------
| name - "width"
| decl - ""
| type - "double"
|
+++ access ----------------------------------------
| kind - "public"
|
+++ constructor ----------------------------------------
| sym:name - "Square"
| name - "Square"
| parms - double
| decl - "f(double)."
| code - "{ }"
| sym:symtab - 0x40194788
|
+++ cdecl ----------------------------------------
| sym:name - "area"
| name - "area"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194788
|
+++ cdecl ----------------------------------------
| sym:name - "perimeter"
| name - "perimeter"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194788
</pre>
</div>
<H3><a name="Extending_nn9">39.4.4 Attribute namespaces</a></H3>
<p>
Attributes of parse tree nodes are often prepended with a namespace qualifier.
For example, the attributes
<tt>sym:name</tt> and <tt>sym:symtab</tt> are attributes related to
symbol table management and are prefixed with <tt>sym:</tt>. As a
general rule, only those attributes which are directly related to the raw declaration
appear without a prefix (type, name, declarator, etc.).
</p>
<p>
Target language modules may add additional attributes to nodes to assist the generation
of wrapper code. The convention for doing this is to place these attributes in a namespace
that matches the name of the target language. For example, <tt>python:foo</tt> or
<tt>perl:foo</tt>.
</p>
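<p>
The naming convention can be illustrated with a small standalone sketch. It does not use SWIG's own
<tt>Hash</tt> type; a <tt>std::map</tt> stands in for a node, and the helpers <tt>splitAttribute</tt> and
<tt>attributesInNamespace</tt> are hypothetical names introduced only for this illustration.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Stand-in for a parse tree node: attribute names mapped to string values.
// (Assumption: SWIG uses its own Hash type; std::map is only a model here.)
using Node = std::map<std::string, std::string>;

// Split "sym:name" into {"sym", "name"}. An unprefixed attribute such as
// "type" yields an empty namespace: {"", "type"}.
std::pair<std::string, std::string> splitAttribute(const std::string &attr) {
  assert(!attr.empty());
  const std::string::size_type colon = attr.find(':');
  if (colon == std::string::npos)
    return {"", attr};
  return {attr.substr(0, colon), attr.substr(colon + 1)};
}

// Collect every attribute of a node that lives in the given namespace.
// Passing "" selects the raw, unprefixed declaration attributes.
Node attributesInNamespace(const Node &n, const std::string &ns) {
  Node result;
  for (const auto &entry : n) {
    if (splitAttribute(entry.first).first == ns)
      result.insert(entry);
  }
  return result;
}
```

<p>
With a node holding <tt>name</tt>, <tt>type</tt>, <tt>sym:name</tt> and <tt>python:foo</tt>, asking for the
<tt>python</tt> namespace returns only the attribute added by the language module, leaving the parser's
attributes untouched.
</p>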
<H3><a name="Extending_nn10">39.4.5 Symbol Tables</a></H3>
<p>
During parsing, all symbols are managed in the space of the target
language. The <tt>sym:name</tt> attribute of each node contains the symbol name
selected by the parser. Normally, <tt>sym:name</tt> and <tt>name</tt>
are the same. However, the <tt>%rename</tt> directive can be used to
change the value of <tt>sym:name</tt>. You can see the effect of
<tt>%rename</tt> by trying it on a simple interface and dumping the
parse tree. For example:
</p>
<div class="code">
<pre>
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);
void foo(int);
void foo(double);
void foo(Bar *b);
</pre>
</div>
<p>
There are various <tt>-debug-</tt> options that can be useful for debugging and analysing the parse tree.
For example, the <tt>-debug-top &lt;n&gt;</tt> and <tt>-debug-module &lt;n&gt;</tt> options
dump the entire parse tree or only the module subtree, respectively, at one of the four stages <tt>n</tt> of processing.
The parse tree can be viewed after the final stage of processing by running SWIG:
</p>
<div class="shell">
<pre>
$ swig -debug-top 4 example.i
...
+++ cdecl ----------------------------------------
| sym:name - "foo_i"
| name - "foo"
| decl - "f(int)."
| parms - int
| type - "void"
| sym:symtab - 0x40165078
|
+++ cdecl ----------------------------------------
| sym:name - "foo_d"
| name - "foo"
| decl - "f(double)."
| parms - double
| type - "void"
| sym:symtab - 0x40165078
|
+++ cdecl ----------------------------------------
| sym:name - "foo"
| name - "foo"
| decl - "f(p.Bar)."
| parms - Bar *
| type - "void"
| sym:symtab - 0x40165078
</pre>
</div>
<p>
All symbol-related conflicts and complaints about overloading are based on <tt>sym:name</tt> values.
For instance, the following example uses <tt>%rename</tt> in reverse to generate a name clash.
</p>
<div class="code">
<pre>
%rename(foo) foo_i(int);
%rename(foo) foo_d(double);
void foo_i(int);
void foo_d(double);
void foo(Bar *b);
</pre>
</div>
<p>
When you run SWIG on this you now get:
</p>
<div class="shell">
<pre>
$ ./swig example.i
example.i:6. Overloaded declaration ignored. foo_d(double )
example.i:5. Previous declaration is foo_i(int )
example.i:7. Overloaded declaration ignored. foo(Bar *)
example.i:5. Previous declaration is foo_i(int )
</pre>
</div>
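<p>
The clash reported above can be modelled in a few lines. The sketch below is not SWIG's symbol table code;
it assumes a target without overloading support and uses a hypothetical <tt>Decl</tt> struct and
<tt>findClashes</tt> function. The point is that the table is keyed on the target-language symbol name,
not on the C/C++ name.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Minimal stand-in for a declaration: the C/C++ prototype and the
// target-language symbol name (what the chapter calls sym:name).
struct Decl {
  std::string name;     // e.g. "foo_i(int)"
  std::string symname;  // e.g. "foo"
};

// Add declarations to a symbol table keyed on symname. The first declaration
// to claim a symbol wins; later ones with the same symname are rejected.
// Returns the prototypes of the rejected declarations, in input order.
std::vector<std::string> findClashes(const std::vector<Decl> &decls) {
  std::map<std::string, std::string> symtab;  // symname -> first declaration
  std::vector<std::string> rejected;
  for (const Decl &d : decls) {
    assert(!d.symname.empty());
    const bool inserted = symtab.emplace(d.symname, d.name).second;
    if (!inserted)
      rejected.push_back(d.name);
  }
  return rejected;
}
```

<p>
Feeding it the three declarations from the renamed example rejects <tt>foo_d(double)</tt> and
<tt>foo(Bar *)</tt>, mirroring the two "ignored" messages, whereas the earlier <tt>foo_i</tt>/<tt>foo_d</tt>
renaming produces three distinct symbols and no rejections.
</p>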
<H3><a name="Extending_nn11">39.4.6 The %feature directive</a></H3>
<p>
A number of SWIG directives such as <tt>%exception</tt> are implemented using the
low-level <tt>%feature</tt> directive. For example:
</p>
<div class="code">
<pre>
%feature("except") getitem(int) {
try {
$action
} catch (badindex) {
...
}
}
...
class Foo {
public:
Object *getitem(int index) throw(badindex);
...
};
</pre>
</div>
<p>
The behavior of <tt>%feature</tt> is very easy to describe--it simply
attaches a new attribute to any parse tree node that matches the
given prototype. When a feature is added, it shows up as an attribute in the <tt>feature:</tt> namespace.
You can see this when running with the <tt>-debug-top 4</tt> option. For example:
</p>
<div class="shell">
<pre>
+++ cdecl ----------------------------------------
| sym:name - "getitem"
| name - "getitem"
| decl - "f(int).p."
| parms - int
| type - "Object"
| feature:except - "{\n try {\n $action\n } catc..."
| sym:symtab - 0x40168ac8
|
</pre>
</div>
<p>
Feature names are completely arbitrary and a target language module can be
programmed to respond to any feature name that it wants to recognize. The
data stored in a feature attribute is usually just a raw unparsed string.
For example, the exception code above is simply
stored without any modifications.
</p>
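<p>
As a rough model of this behaviour, the sketch below attaches a raw string to every node that matches a
prototype. It is a simplification under stated assumptions: nodes are modelled with <tt>std::map</tt>,
the hypothetical <tt>attachFeature</tt> helper matches by exact comparison of the <tt>name</tt> and
<tt>decl</tt> attributes, and SWIG's real matching rules are considerably richer.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for a parse tree node (assumption: not SWIG's Hash type).
using Node = std::map<std::string, std::string>;

// Attach a feature to every node whose "name" and "decl" attributes equal
// the given prototype. The value is stored verbatim, unparsed, under the
// attribute "feature:<featureName>". Returns the number of annotated nodes.
int attachFeature(std::vector<Node> &nodes, const std::string &featureName,
                  const std::string &name, const std::string &decl,
                  const std::string &value) {
  assert(!featureName.empty());
  int matched = 0;
  for (Node &n : nodes) {
    const auto nameIt = n.find("name");
    const auto declIt = n.find("decl");
    if (nameIt == n.end() || declIt == n.end())
      continue;  // nodes without a declaration cannot match a prototype
    if (nameIt->second == name && declIt->second == decl) {
      n["feature:" + featureName] = value;
      ++matched;
    }
  }
  return matched;
}
```

<p>
A language module would later look up <tt>feature:except</tt> on a node and decide for itself how to
interpret the stored text.
</p>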
<H3><a name="Extending_nn12">39.4.7 Code Generation</a></H3>
<p>
Language modules work by defining handler functions that know how to respond to
different types of parse-tree nodes. These handlers simply look at the
attributes of each node in order to produce low-level code.
</p>
<p>
In reality, the generation of code is somewhat more subtle than simply
invoking handler functions. This is because parse-tree nodes might be
transformed. For example, suppose you are wrapping a class like this:
</p>
<div class="code">
<pre>
class Foo {
public:
virtual int *bar(int x);
};
</pre>
</div>
<p>
When the parser constructs a node for the member <tt>bar</tt>, it creates a raw "cdecl" node with the following
attributes:
</p>
<div class="diagram">
<pre>
nodeType : cdecl
name : bar
type : int
decl : f(int).p.
parms : int x
storage : virtual
sym:name : bar
</pre>
</div>
<p>
To produce wrapper code, this "cdecl" node undergoes a number of transformations. First, the node is recognized as a function declaration. This adjusts some of the type information--specifically, the declarator is joined with the base datatype to produce this:
</p>
<div class="diagram">
<pre>
nodeType : cdecl
name : bar
type : p.int &lt;-- Notice change in return type
decl : f(int).p.
parms : int x
storage : virtual
sym:name : bar
</pre>
</div>
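<p>
The joining step can be sketched as a small string operation. This is an illustration only, assuming the
declarator is written with the trailing dot seen in the debug output (for example <tt>f(int).p.</tt>);
<tt>joinReturnType</tt> is a hypothetical helper, not a function from the SWIG sources.
</p>

```cpp
#include <cassert>
#include <string>

// Given a declarator such as "f(int).p." and a base type such as "int",
// return the encoded return type ("p.int"). A declarator that does not start
// with a function element is simply prefixed to the base type.
std::string joinReturnType(const std::string &decl, const std::string &base) {
  assert(!base.empty());
  if (decl.compare(0, 2, "f(") != 0)
    return decl + base;

  // Skip the function element "f(...)" while honouring nested parentheses.
  int depth = 0;
  std::string::size_type i = 1;
  for (; i < decl.size(); ++i) {
    if (decl[i] == '(') {
      ++depth;
    } else if (decl[i] == ')' && --depth == 0) {
      break;
    }
  }
  assert(i < decl.size());  // unbalanced parentheses are not handled here

  // Step over the '.' that terminates the function element, if present.
  std::string::size_type rest = i + 1;
  if (rest < decl.size() && decl[rest] == '.')
    ++rest;

  // Whatever follows the function element modifies the return type.
  return decl.substr(rest) + base;
}
```

<p>
For the <tt>bar</tt> member above, the modifiers that follow the function element (<tt>p.</tt>) are moved
in front of <tt>int</tt>, giving <tt>p.int</tt>.
</p>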
<p>
Next, the context of the node indicates that the node is really a
member function. This produces a transformation to a low-level
accessor function like this:
</p>
<div class="diagram">
<pre>
nodeType : cdecl
name : bar
type : p.int
decl : f(int).p.
parms : Foo *self, int x &lt;-- Added parameter
storage : virtual
wrap:action : result = (arg1)-&gt;bar(arg2) &lt;-- Action code added
sym:name : Foo_bar &lt;-- Symbol name changed
</pre>
</div>
<p>
In this transformation, notice how an additional parameter was added
to the parameter list and how the symbol name of the node has suddenly
changed into an accessor using the naming scheme described in the
"SWIG Basics" chapter. A small fragment of "action" code has also
been generated--notice how the <tt>wrap:action</tt> attribute defines
the access to the underlying method. The data in this transformed
node is then used to generate a wrapper.
</p>
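<p>
The member-to-accessor rewrite can be approximated as follows. The sketch is a model, not SWIG's
implementation: nodes are plain <tt>std::map</tt> objects, <tt>toAccessor</tt> is a hypothetical name, and
parameters are counted naively by commas, which would miscount parameters that are themselves function
pointers.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for a parse tree node (assumption: not SWIG's Hash type).
using Node = std::map<std::string, std::string>;

// Rewrite a member-function node of class `className` into the flat accessor
// form: prepend a `self` parameter, prefix the symbol name with the class
// name, and record the call expression under wrap:action.
Node toAccessor(const Node &member, const std::string &className) {
  assert(member.count("name") && member.count("sym:name"));
  Node out = member;

  const std::string parms = member.count("parms") ? member.at("parms") : "";

  // Count the original parameters (comma-separated; empty means none).
  int count = 0;
  if (!parms.empty()) {
    count = 1;
    for (char c : parms) {
      if (c == ',')
        ++count;
    }
  }

  // arg1 is reserved for self, so the original parameters start at arg2.
  std::string args;
  for (int i = 0; i < count; ++i) {
    if (i > 0)
      args += ", ";
    args += "arg" + std::to_string(i + 2);
  }

  const std::string self = className + " *self";
  out["parms"] = parms.empty() ? self : self + ", " + parms;
  out["wrap:action"] =
      "result = (arg1)->" + member.at("name") + "(" + args + ")";
  out["sym:name"] = className + "_" + member.at("sym:name");
  return out;
}
```

<p>
Applied to the <tt>bar</tt> node with class name <tt>Foo</tt>, this yields the parameter list
<tt>Foo *self, int x</tt>, the action <tt>result = (arg1)-&gt;bar(arg2)</tt> and the symbol name
<tt>Foo_bar</tt>, while attributes such as <tt>type</tt> are carried over unchanged.
</p>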
<p>
Language modules work by registering handler functions for dealing with
various types of nodes at different stages of transformation. This is done by
inheriting from a special <tt>Language</tt> class and defining a collection
of virtual methods. For example, the Python module defines a class as
follows:
</p>
<div class="code">
<pre>
class PYTHON : public Language {
protected:
public :
virtual void main(int, char *argv[]);
virtual int top(Node *);
virtual int functionWrapper(Node *);
virtual int constantWrapper(Node *);
virtual int variableWrapper(Node *);
virtual int nativeWrapper(Node *);
virtual int membervariableHandler(Node *);
virtual int memberconstantHandler(Node *);
virtual int memberfunctionHandler(Node *);
virtual int constructorHandler(Node *);
virtual int destructorHandler(Node *);
virtual int classHandler(Node *);
virtual int classforwardDeclaration(Node *);
virtual int insertDirective(Node *);
virtual int importDirective(Node *);
};
</pre>
</div>
<p>
The role of these functions is described shortly.
</p>
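<p>
To make the handler idea concrete before the detailed description, here is a heavily reduced, standalone
model of the pattern. <tt>MiniLanguage</tt> and <tt>ToyModule</tt> are hypothetical classes, not part of
SWIG, and the routing rule (a <tt>cdecl</tt> whose declarator starts with <tt>f(</tt> is a function) is an
assumption made for the sketch.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for a parse tree node (assumption: not SWIG's Hash type).
using Node = std::map<std::string, std::string>;

// Hypothetical, much-reduced stand-in for a Language base class. Handlers
// return 1 when they produced output and 0 when they did nothing.
class MiniLanguage {
public:
  virtual ~MiniLanguage() {}
  virtual int functionWrapper(const Node &) { return 0; }
  virtual int variableWrapper(const Node &) { return 0; }

  // Route a node to a handler based on its attributes. Only "cdecl" nodes
  // are handled; everything else is ignored in this sketch.
  int emit(const Node &n) {
    assert(n.count("nodeType"));
    if (n.at("nodeType") != "cdecl")
      return 0;
    const auto decl = n.find("decl");
    const bool isFunction =
        decl != n.end() && decl->second.compare(0, 2, "f(") == 0;
    return isFunction ? functionWrapper(n) : variableWrapper(n);
  }
};

// A toy target language that only records what it was asked to wrap.
class ToyModule : public MiniLanguage {
public:
  std::string log;

  int functionWrapper(const Node &n) override {
    log += "function " + n.at("sym:name") + "\n";
    return 1;
  }
  int variableWrapper(const Node &n) override {
    log += "variable " + n.at("sym:name") + "\n";
    return 1;
  }
};
```

<p>
The design point is that the base class owns the traversal and routing, so a new target language only
overrides the handlers it cares about.
</p>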
<H3><a name="Extending_nn13">39.4.8 SWIG and XML</a></H3>
<p>
Much of SWIG's current parser design was originally motivated by
interest in using XML to represent SWIG parse trees. Although XML is
not currently used in any direct manner, the parse tree structure, use
of node tags, attributes, and attribute namespaces are all influenced
by aspects of XML parsing. Therefore, in trying to understand SWIG's
internal data structures, it may be useful to keep XML in the back of
your mind as a model.
</p>
<H2><a name="Extending_nn14">39.5 Primitive Data Structures</a></H2>
<p>
Most of SWIG is constructed using three basic data structures:
strings, hashes, and lists. These data structures are dynamic in the same way as
similar structures found in many scripting languages. For instance,
you can have containers (lists and hash tables) of mixed types and
certain operations are polymorphic.
</p>
<p>
This section briefly describes the basic structures so that later
sections of this chapter make more sense.
</p>
<p>
When describing the low-level API, the following type name conventions are
used:
</p>
<ul>
<li><tt>String</tt>. A string object.
<li><tt>Hash</tt>. A hash object.
<li><tt>List</tt>. A list object.
<li><tt>String_or_char</tt>. A string object or a <tt>char *</tt>.
<li><tt>Object_or_char</tt>. An object or a <tt>char *</tt>.
<li><tt>Object</tt>. Any object (string, hash, list, etc.)
</ul>
<p>
In most cases, other typenames in the source are aliases for one of these
primitive types. Specifically:
</p>
<div class="code">
<pre>
typedef String SwigType;
typedef Hash Parm;
typedef Hash ParmList;
typedef Hash Node;
typedef Hash Symtab;
typedef Hash Typetab;
</pre>
</div>
<H3><a name="Extending_nn15">39.5.1 Strings</a></H3>
<p>
<b><tt>String *NewString(const String_or_char *val)</tt></b>
</p>
<div class="indent">
Creates a new string with initial value <tt>val</tt>. <tt>val</tt> may
be a <tt>char *</tt> or another <tt>String</tt> object. If you want
to create an empty string, use <tt>""</tt> for <tt>val</tt>.
</div>
<p>
<b><tt>String *NewStringf(const char *fmt, ...)</tt></b>
</p>
<div class="indent">
Creates a new string whose initial value is set according to a C <tt>printf</tt> style
format string in <tt>fmt</tt>. Additional arguments follow depending
on <tt>fmt</tt>.
</div>
<p>
<b><tt>String *Copy(String *s)</tt></b>
</p>
<div class="indent">
Make a copy of the string <tt>s</tt>.
</div>
<p>
<b><tt>void Delete(String *s)</tt></b>
</p>
<div class="indent">
Deletes <tt>s</tt>.
</div>
<p>
<b><tt>int Len(const String_or_char *s)</tt></b>
</p>
<div class="indent">
Returns the length of the string.
</div>
<p>
<b><tt>char *Char(const String_or_char *s)</tt></b>
</p>
<div class="indent">
Returns a pointer to the first character in a string.
</div>
<p>
<b><tt>void Append(String *s, const String_or_char *t)</tt></b>
</p>
<div class="indent">
Appends <tt>t</tt> to the end of string <tt>s</tt>.
</div>
<p>
<b><tt>void Insert(String *s, int pos, const String_or_char *t)</tt></b>
</p>
<div class="indent">
Inserts <tt>t</tt> into <tt>s</tt> at position <tt>pos</tt>. The contents
of <tt>s</tt> are shifted accordingly. The special value <tt>DOH_END</tt>
can be used for <tt>pos</tt> to indicate insertion at the end of the string (appending).
</div>
<p>
<b><tt>int Strcmp(const String_or_char *s, const String_or_char *t)</tt></b>
</p>
<div class="indent">
Compare strings <tt>s</tt> and <tt>t</tt>. Same as the C <tt>strcmp()</tt>
function.
</div>
<p>
<b><tt>int Strncmp(const String_or_char *s, const String_or_char *t, int len)</tt></b>
</p>
<div class="indent">
Compare the first <tt>len</tt> characters of strings <tt>s</tt> and <tt>t</tt>. Same as the C <tt>strncmp()</tt>
function.
</div>
<p>
<b><tt>char *Strstr(const String_or_char *s, const String_or_char *pat)</tt></b>
</p>
<div class="indent">
Returns a pointer to the first occurrence of <tt>pat</tt> in <tt>s</tt>.
Same as the C <tt>strstr()</tt> function.
</div>
<p>
<b><tt>char *Strchr(const String_or_char *s, char ch)</tt></b>
</p>
<div class="indent">
Returns a pointer to the first occurrence of character <tt>ch</tt> in <tt>s</tt>.
Same as the C <tt>strchr()</tt> function.
</div>
<p>
<b><tt>void Chop(String *s)</tt></b>
</p>
<div class="indent">
Chops trailing whitespace off the end of <tt>s</tt>.
</div>
<p>
<b><tt>int Replace(String *s, const String_or_char *pat, const String_or_char *rep, int flags)</tt></b>
</p>
<div class="indent">
<p>
Replaces the pattern <tt>pat</tt> with <tt>rep</tt> in string <tt>s</tt>.
<tt>flags</tt> is a combination of the following flags:</p>
<div class="code">
<pre>
DOH_REPLACE_ANY         - Replace all occurrences
DOH_REPLACE_ID          - Valid C identifiers only
DOH_REPLACE_NOQUOTE     - Don't replace in quoted strings
DOH_REPLACE_FIRST       - Replace first occurrence only.
</pre>
</div>
<p>
Returns the number of replacements made (if any).
</p>
</div>
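<p>
The effect of restricting a replacement to whole identifiers can be illustrated without
DOH. The following is a minimal standalone sketch, assuming that <tt>DOH_REPLACE_ID</tt>
means "replace <tt>pat</tt> only where it stands alone as a C identifier". The function
<tt>replace_identifier()</tt> and its buffer-based interface are hypothetical and are not
part of SWIG.
</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c may appear inside a C identifier. */
static int is_ident_char(char c) {
  return c == '_' || isalnum((unsigned char)c);
}

/*
 * Hypothetical stand-in for Replace(s, pat, rep, DOH_REPLACE_ID).
 * Copies src into out, replacing pat with rep only where pat is a
 * whole identifier. Returns the number of replacements made, or -1
 * if out is too small.
 */
static int replace_identifier(const char *src, const char *pat, const char *rep,
                              char *out, size_t outsize) {
  size_t plen = strlen(pat), rlen = strlen(rep), used = 0;
  int count = 0;
  const char *p = src;
  assert(plen > 0 && outsize > 0);
  while (*p) {
    /* A match must not be glued to identifier characters on either side. */
    int at_start = (p == src) || !is_ident_char(p[-1]);
    if (at_start && strncmp(p, pat, plen) == 0 && !is_ident_char(p[plen])) {
      if (used + rlen >= outsize) return -1;
      memcpy(out + used, rep, rlen);
      used += rlen;
      p += plen;
      count++;
    } else {
      if (used + 1 >= outsize) return -1;
      out[used++] = *p++;
    }
  }
  out[used] = '\0';
  return count;
}
```

<p>
With this sketch, replacing <tt>arg</tt> by <tt>x</tt> in <tt>"arg1 = arg + arg_2;"</tt>
changes only the middle occurrence, because <tt>arg1</tt> and <tt>arg_2</tt> are different
identifiers.
</p>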
<H3><a name="Extending_nn16">39.5.2 Hashes</a></H3>
<p>
<b><tt>Hash *NewHash()</tt></b>
</p>
<div class="indent">
Creates a new empty hash table.
</div>
<p>
<b><tt>Hash *Copy(Hash *h)</tt></b>
</p>
<div class="indent">
Make a shallow copy of the hash <tt>h</tt>.
</div>
<p>
<b><tt>void Delete(Hash *h)</tt></b>
</p>
<div class="indent">
Deletes <tt>h</tt>.
</div>
<p>
<b><tt>int Len(Hash *h)</tt></b>
</p>
<div class="indent">
Returns the number of items in <tt>h</tt>.
</div>
<p>
<b><tt>Object *Getattr(Hash *h, const String_or_char *key)</tt></b>
</p>
<div class="indent">
Gets an object from <tt>h</tt>. <tt>key</tt> may be a string or
a simple <tt>char *</tt> string. Returns NULL if not found.
</div>
<p>
<b><tt>int Setattr(Hash *h, const String_or_char *key, const Object_or_char *val)</tt></b>
</p>
<div class="indent">
Stores <tt>val</tt> in <tt>h</tt>. <tt>key</tt> may be a string or
a simple <tt>char *</tt>. If <tt>val</tt> is not a standard
object (String, Hash, or List) it is assumed to be a <tt>char *</tt> in which
case it is used to construct a <tt>String</tt> that is stored in the hash.
If <tt>val</tt> is NULL, the object is deleted. Increases the reference count
of <tt>val</tt>. Returns 1 if this operation replaced an existing hash entry,
0 otherwise.
</div>
<p>
<b><tt>int Delattr(Hash *h, const String_or_char *key)</tt></b>
</p>
<div class="indent">
Deletes the hash item referenced by <tt>key</tt>. Decreases the
reference count on the corresponding object (if any). Returns 1
if an object was removed, 0 otherwise.
</div>
<p>
<b><tt>List *Keys(Hash *h)</tt></b>
</p>
<div class="indent">
Returns the list of hash table keys.
</div>
<H3><a name="Extending_nn17">39.5.3 Lists</a></H3>
<p>
<b><tt>List *NewList()</tt></b>
</p>
<div class="indent">
Creates a new empty list.
</div>
<p>
<b><tt>List *Copy(List *x)</tt></b>
</p>
<div class="indent">
Make a shallow copy of the List <tt>x</tt>.
</div>
<p>
<b><tt>void Delete(List *x)</tt></b>
</p>
<div class="indent">
Deletes <tt>x</tt>.
</div>
<p>
<b><tt>int Len(List *x)</tt></b>
</p>
<div class="indent">
Returns the number of items in <tt>x</tt>.
</div>
<p>
<b><tt>Object *Getitem(List *x, int n)</tt></b>
</p>
<div class="indent">
Returns an object from <tt>x</tt> with index <tt>n</tt>. If <tt>n</tt> is
beyond the end of the list, the last item is returned. If <tt>n</tt> is
negative, the first item is returned.
</div>
<p>
<b><tt>int Setitem(List *x, int n, const Object_or_char *val)</tt></b>
</p>
<div class="indent">
Stores <tt>val</tt> in <tt>x</tt>.
If <tt>val</tt> is not a standard
object (String, Hash, or List) it is assumed to be a <tt>char *</tt> in which
case it is used to construct a <tt>String</tt> that is stored in the list.
<tt>n</tt> must be in range. Otherwise, an assertion will be raised.
</div>
<p>
<b><tt>int Delitem(List *x, int n)</tt></b>
</p>
<div class="indent">
Deletes item <tt>n</tt> from the list, shifting items down if necessary.
To delete the last item in the list, use the special value <tt>DOH_END</tt>
for <tt>n</tt>.
</div>
<p>
<b><tt>void Append(List *x, const Object_or_char *t)</tt></b>
</p>
<div class="indent">
Appends <tt>t</tt> to the end of <tt>x</tt>. If <tt>t</tt> is not
a standard object, it is assumed to be a <tt>char *</tt> and is
used to create a String object.
</div>
<p>
<b><tt>void Insert(List *x, int pos, const Object_or_char *t)</tt></b>
</p>
<div class="indent">
Inserts <tt>t</tt> into <tt>x</tt> at position <tt>pos</tt>. The contents
of <tt>x</tt> are shifted accordingly. The special value <tt>DOH_END</tt>
can be used for <tt>pos</tt> to indicate insertion at the end of the list (appending).
If <tt>t</tt> is not a standard object, it is assumed to be a <tt>char *</tt>
and is used to create a String object.
</div>
<H3><a name="Extending_nn18">39.5.4 Common operations</a></H3>
<p>
The following operations are applicable to all datatypes.
</p>
<p>
<b><tt>Object *Copy(Object *x)</tt></b>
</p>
<div class="indent">
Make a copy of the object <tt>x</tt>.
</div>
<p>
<b><tt>void Delete(Object *x)</tt></b>
</p>
<div class="indent">
Deletes <tt>x</tt>.
</div>
<p>
<b><tt>void Setfile(Object *x, String_or_char *f)</tt></b>
</p>
<div class="indent">
Sets the filename associated with <tt>x</tt>. Used to track
objects and report errors.
</div>
<p>
<b><tt>String *Getfile(Object *x)</tt></b>
</p>
<div class="indent">
Gets the filename associated with <tt>x</tt>.
</div>
<p>
<b><tt>void Setline(Object *x, int n)</tt></b>
</p>
<div class="indent">
Sets the line number associated with <tt>x</tt>. Used to track
objects and report errors.
</div>
<p>
<b><tt>int Getline(Object *x)</tt></b>
</p>
<div class="indent">
Gets the line number associated with <tt>x</tt>.
</div>
<H3><a name="Extending_nn19">39.5.5 Iterating over Lists and Hashes</a></H3>
<p>
To iterate over the elements of a list or a hash table, the following functions are used:
</p>
<p>
<b><tt>Iterator First(Object *x)</tt></b>
</p>
<div class="indent">
Returns an iterator object that points to the first item in a list or hash table. The
<tt>item</tt> attribute of the Iterator object is a pointer to the item. For hash tables, the <tt>key</tt> attribute
of the Iterator object additionally points to the corresponding Hash table key. The <tt>item</tt> and <tt>key</tt> attributes
are NULL if the object contains no items or if there are no more items.
</div>
<p>
<b><tt>Iterator Next(Iterator i)</tt></b>
</p>
<div class="indent">
<p>Returns an iterator that points to the next item in a list or hash table.
Here are two examples of iteration:</p>
<div class="code">
<pre>
List *l = (some list);
Iterator i;
for (i = First(l); i.item; i = Next(i)) {
  Printf(stdout, "%s\n", i.item);
}

Hash *h = (some hash);
Iterator j;
for (j = First(h); j.item; j = Next(j)) {
  Printf(stdout, "%s : %s\n", j.key, j.item);
}
</pre>
</div>
</div>
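<p>
The iteration protocol (an iterator passed by value whose <tt>item</tt> field becomes NULL
when the container is exhausted) can be modelled in plain C. This is only a sketch of the
idea: <tt>StrList</tt>, <tt>StrIter</tt>, <tt>str_first()</tt>, and <tt>str_next()</tt> are
hypothetical names and do not reflect how DOH implements <tt>Iterator</tt>.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a DOH list: an array of C strings. */
typedef struct {
  const char **items;
  int len;
} StrList;

/* Model of an iterator returned by value; item is NULL at the end. */
typedef struct {
  const StrList *list;
  int index;
  const char *item;
} StrIter;

/* Builds an iterator for position index, or an exhausted one past the end. */
static StrIter iter_at(const StrList *l, int index) {
  StrIter it;
  it.list = l;
  it.index = index;
  it.item = (index < l->len) ? l->items[index] : NULL;
  return it;
}

/* Plays the role of First(): points at the first item, if any. */
static StrIter str_first(const StrList *l) {
  assert(l != NULL);
  return iter_at(l, 0);
}

/* Plays the role of Next(): returns a new iterator one step further on. */
static StrIter str_next(StrIter it) {
  return iter_at(it.list, it.index + 1);
}

/* Counts items using the same loop shape as the First()/Next() examples. */
static int count_items(const StrList *l) {
  int n = 0;
  StrIter it;
  for (it = str_first(l); it.item; it = str_next(it)) {
    n++;
  }
  return n;
}
```

<p>
Because the loop condition tests <tt>it.item</tt>, an empty container needs no special
case: the first iterator already has a NULL item and the loop body never runs.
</p>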
<H3><a name="Extending_nn20">39.5.6 I/O</a></H3>
<p>
Special I/O functions are used for all internal I/O. These operations
work on C <tt>FILE *</tt> objects, String objects, and special <tt>File</tt> objects
(which are merely a wrapper around <tt>FILE *</tt>).
</p>
<p>
<b><tt>int Printf(String_or_FILE *f, const char *fmt, ...)</tt></b>
</p>
<div class="indent">
Formatted I/O. Same as the C <tt>fprintf()</tt> function except that output
can also be directed to a string object. Note: the <tt>%s</tt> format
specifier works with both strings and <tt>char *</tt>. All other format
operators have the same meaning.
</div>
<p>
<b><tt>int Printv(String_or_FILE *f, String_or_char *arg1, ..., NULL)</tt></b>
</p>
<div class="indent">
Prints a variable number of string arguments to the output. The last
argument to this function must be NULL. The other arguments can either
be <tt>char *</tt> or string objects.
</div>
<p>
<b><tt>int Putc(int ch, String_or_FILE *f)</tt></b>
</p>
<div class="indent">
Same as the C <tt>fputc()</tt> function.
</div>
<p>
<b><tt>int Write(String_or_FILE *f, void *buf, int len)</tt></b>
</p>
<div class="indent">
Same as the C <tt>write()</tt> function.
</div>
<p>
<b><tt>int Read(String_or_FILE *f, void *buf, int maxlen)</tt></b>
</p>
<div class="indent">
Same as the C <tt>read()</tt> function.
</div>
<p>
<b><tt>int Getc(String_or_FILE *f)</tt></b>
</p>
<div class="indent">
Same as the C <tt>fgetc()</tt> function.
</div>
<p>
<b><tt>int Ungetc(int ch, String_or_FILE *f)</tt></b>
</p>
<div class="indent">
Same as the C <tt>ungetc()</tt> function.
</div>
<p>
<b><tt>int Seek(String_or_FILE *f, int offset, int whence)</tt></b>
</p>
<div class="indent">
Same as the C <tt>fseek()</tt> function. <tt>offset</tt> is the number
of bytes. <tt>whence</tt> is one of <tt>SEEK_SET</tt>, <tt>SEEK_CUR</tt>,
or <tt>SEEK_END</tt>.
</div>
<p>
<b><tt>long Tell(String_or_FILE *f)</tt></b>
</p>
<div class="indent">
Same as the C <tt>ftell()</tt> function.
</div>
<p>
<b><tt>File *NewFile(const char *filename, const char *mode, List *newfiles)</tt></b>
</p>
<div class="indent">
Create a File object using the <tt>fopen()</tt> library call. This
file differs from <tt>FILE *</tt> in that it can be placed in the standard
SWIG containers (lists, hashes, etc.). The <tt>filename</tt> is added to the
<tt>newfiles</tt> list if <tt>newfiles</tt> is non-zero and the file was created successfully.
</div>
<p>
<b><tt>File *NewFileFromFile(FILE *f)</tt></b>
</p>
<div class="indent">
Create a File object wrapper around an existing <tt>FILE *</tt> object.
</div>
<p>
There is no explicit function to close a file. Just call <tt>Delete(f)</tt>:
this decreases the reference count, and the file will be closed when the
reference count reaches zero.
</p>
<p>
The I/O functions above, together with strings, play a critical role in SWIG. It is
common to see small fragments of code generated like this:
</p>
<div class="code">
<pre>
/* Print into a string */
int i;
String *s = NewString("");
Printf(s, "Hello\n");
for (i = 0; i &lt; 10; i++) {
  Printf(s, "%d\n", i);
}
...
/* Print string into a file */
Printf(f, "%s\n", s);
</pre>
</div>
<p>
Similarly, the preprocessor and parser both operate on string-files.
</p>
<H2><a name="Extending_nn21">39.6 Navigating and manipulating parse trees</a></H2>
<p>
Parse trees are built as collections of hash tables. Each node is a hash table in which
arbitrary attributes can be stored. Certain attributes in the hash table provide links to
other parse tree nodes. The following macros can be used to move around the parse tree.
</p>
<p>
<b><tt>String *nodeType(Node *n)</tt></b>
</p>
<div class="indent">
Returns the node type tag as a string. The returned string indicates the type of parse
tree node.
</div>
<p>
<b><tt>Node *nextSibling(Node *n)</tt></b>
</p>
<div class="indent">
Returns the next node in the parse tree. For example, the next C declaration.
</div>
<p>
<b><tt>Node *previousSibling(Node *n)</tt></b>
</p>
<div class="indent">
Returns the previous node in the parse tree. For example, the previous C declaration.
</div>
<p>
<b><tt>Node *firstChild(Node *n)</tt></b>
</p>
<div class="indent">
Returns the first child node. For example, if <tt>n</tt> was a C++ class node, this would
return the node for the first class member.
</div>
<p>
<b><tt>Node *lastChild(Node *n)</tt></b>
</p>
<div class="indent">
Returns the last child node. You might use this if you wanted to append a new
node to the children of a class.
</div>
<p>
<b><tt>Node *parentNode(Node *n)</tt></b>
</p>
<div class="indent">
Returns the parent of node <tt>n</tt>. Use this to move up the parse tree.
</div>
<p>
The following macros can be used to change all of the above attributes.
Normally, these macros are only used by the parser. Changing them without
knowing what you are doing is likely to be dangerous.
</p>
<p>
<b><tt>void set_nodeType(Node *n, const String_or_char *t)</tt></b>
</p>
<div class="indent">
Changes the node type tag of <tt>n</tt> to <tt>t</tt>.
</div>
<p>
<b><tt>void set_nextSibling(Node *n, Node *s)</tt></b>
</p>
<div class="indent">
Set the next sibling.
</div>
<p>
<b><tt>void set_previousSibling(Node *n, Node *s)</tt></b>
</p>
<div class="indent">
Set the previous sibling.
</div>
<p>
<b><tt>void set_firstChild(Node *n, Node *c)</tt></b>
</p>
<div class="indent">
Set the first child node.
</div>
<p>
<b><tt>void set_lastChild(Node *n, Node *c)</tt></b>
</p>
<div class="indent">
Set the last child node.
</div>
<p>
<b><tt>void set_parentNode(Node *n, Node *p)</tt></b>
</p>
<div class="indent">
Set the parent node.
</div>
<p>
The following utility functions are used to alter the parse tree (at your own risk):
</p>
<p>
<b><tt>void appendChild(Node *parent, Node *child)</tt></b>
</p>
<div class="indent">
Append a child to <tt>parent</tt>. The appended node becomes the last child.
</div>
<p>
<b><tt>void deleteNode(Node *node)</tt></b>
</p>
<div class="indent">
Deletes a node from the parse tree. Deletion reconnects siblings and properly updates
the parent so that sibling nodes are unaffected.
</div>
<H2><a name="Extending_nn22">39.7 Working with attributes</a></H2>
<p>
Since parse tree nodes are just hash tables, attributes are accessed using the <tt>Getattr()</tt>,
<tt>Setattr()</tt>, and <tt>Delattr()</tt> operations. For example:
</p>
<div class="code">
<pre>
int functionHandler(Node *n) {
  String *name = Getattr(n, "name");
  String *symname = Getattr(n, "sym:name");
  SwigType *type = Getattr(n, "type");
  ...
}
</pre>
</div>
<p>
New attributes can be freely attached to a node as needed. However, when new attributes
are attached during code generation, they should be prepended with a namespace prefix.
For example:
</p>
<div class="code">
<pre>
...
Setattr(n, "python:docstring", doc); /* Store docstring */
...
</pre>
</div>
<p>
A quick way to check the value of an attribute is to use the <tt>checkAttribute()</tt> function like this:
</p>
<div class="code">
<pre>
if (checkAttribute(n, "storage", "virtual")) {
  /* n is virtual */
  ...
}
</pre>
</div>
<p>
Changing the values of existing attributes is allowed and is sometimes done to implement
node transformations. However, if a function/method modifies a node, it is required to restore
modified attributes to their original values. To simplify the task of saving/restoring attributes,
the following functions are used:
</p>
<p>
<b><tt>int Swig_save(const char *ns, Node *n, const char *name1, const char *name2, ..., NIL)</tt></b>
</p>
<div class="indent">
Saves a copy of attributes <tt>name1</tt>, <tt>name2</tt>, etc. from node <tt>n</tt>.
Copies of the attributes are actually resaved in the node in a different namespace which is
set by the <tt>ns</tt> argument. For example, if you call <tt>Swig_save("foo", n, "type", NIL)</tt>,
then the "type" attribute will be copied and saved as "foo:type". The namespace name itself is stored in
the "view" attribute of the node. If necessary, this can be examined to find out where previous
values of attributes might have been saved.
</div>
<p>
<b><tt>int Swig_restore(Node *n)</tt></b>
</p>
<div class="indent">
<p>
Restores the attributes saved by the previous call to <tt>Swig_save()</tt>. Those
attributes that were supplied to <tt>Swig_save()</tt> will be restored to their
original values.
</p>
<p>
The <tt>Swig_save()</tt> and <tt>Swig_restore()</tt> functions must always be used as a pair.
That is, every call to <tt>Swig_save()</tt> must have a matching call to <tt>Swig_restore()</tt>.
Calls can be nested if necessary. Here is an example that shows how the functions might be used:
</p>
<div class="code">
<pre>
int variableHandler(Node *n) {
  Swig_save("variableHandler", n, "type", "sym:name", NIL);
  String *symname = Getattr(n, "sym:name");
  SwigType *type = Getattr(n, "type");
  ...
  Append(symname, "_global");     // Change symbol name
  SwigType_add_pointer(type);     // Add pointer
  ...
  /* generate wrappers */
  ...
  Swig_restore(n);                // Restore original values
  return SWIG_OK;
}
</pre>
</div>
</div>
<p>
<b><tt>int Swig_require(const char *ns, Node *n, const char *name1, const char *name2, ..., NIL)</tt></b>
</p>
<div class="indent">
This is an enhanced version of <tt>Swig_save()</tt> that adds error checking. If an attribute
name is not present in <tt>n</tt>, a failed assertion results and SWIG terminates with a fatal
error. Optionally, if an attribute name is specified as "*<em>name</em>", a copy of the
attribute is saved as with <tt>Swig_save()</tt>. If an attribute is specified as "?<em>name</em>",
the attribute is optional. <tt>Swig_restore()</tt> must always be called after using this
function.
</div>
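<p>
The save/restore idea described for <tt>Swig_save()</tt> and <tt>Swig_restore()</tt>
(copy an attribute to <tt>"ns:name"</tt>, modify it, then copy the saved value back) can be
sketched with a tiny standalone attribute table. Everything here is hypothetical:
<tt>AttrNode</tt>, <tt>attr_save()</tt>, and <tt>attr_restore()</tt> are illustration-only
names, the namespace is passed explicitly instead of being stored in a "view" attribute,
and the saved copy is left in place after restoring.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ATTRS 16
#define MAX_TEXT 64

/* Hypothetical stand-in for a parse tree node: a tiny key/value table. */
typedef struct {
  char keys[MAX_ATTRS][MAX_TEXT];
  char vals[MAX_ATTRS][MAX_TEXT];
  int count;
} AttrNode;

/* Returns the value stored under key, or NULL if it is absent. */
static const char *attr_get(const AttrNode *n, const char *key) {
  int i;
  for (i = 0; i < n->count; i++) {
    if (strcmp(n->keys[i], key) == 0) return n->vals[i];
  }
  return NULL;
}

/* Stores val under key, replacing any existing value. */
static void attr_set(AttrNode *n, const char *key, const char *val) {
  int i;
  for (i = 0; i < n->count; i++) {
    if (strcmp(n->keys[i], key) == 0) break;
  }
  if (i == n->count) {
    assert(n->count < MAX_ATTRS);
    snprintf(n->keys[i], MAX_TEXT, "%s", key);
    n->count++;
  }
  snprintf(n->vals[i], MAX_TEXT, "%s", val);
}

/* Copies attribute name to "ns:name", as described for Swig_save(). */
static void attr_save(AttrNode *n, const char *ns, const char *name) {
  char saved[MAX_TEXT];
  const char *val = attr_get(n, name);
  snprintf(saved, sizeof saved, "%s:%s", ns, name);
  if (val) attr_set(n, saved, val);
}

/* Copies "ns:name" back to name, undoing later modifications. */
static void attr_restore(AttrNode *n, const char *ns, const char *name) {
  char saved[MAX_TEXT];
  const char *val;
  snprintf(saved, sizeof saved, "%s:%s", ns, name);
  val = attr_get(n, saved);
  if (val) attr_set(n, name, val);
}
```

<p>
In this model, saving <tt>"type"</tt> under namespace <tt>"foo"</tt> creates a
<tt>"foo:type"</tt> entry, so the original value survives any change made to
<tt>"type"</tt> between the save and the restore.
</p>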
<H2><a name="Extending_nn23">39.8 Type system</a></H2>
<p>
SWIG implements the complete C++ type system including typedef, inheritance,
pointers, references, and pointers to members. A detailed discussion of
type theory is impossible here. However, let's cover the highlights.
</p>
<H3><a name="Extending_nn24">39.8.1 String encoding of types</a></H3>
<p>
All types in SWIG consist of a base datatype and a collection of type
operators that are applied to the base. A base datatype is almost
always some kind of primitive type such as <tt>int</tt> or <tt>double</tt>.
The operators consist of things like pointers, references, arrays, and so forth.
Internally, types are represented as strings that are constructed in a very
precise manner. Here are some examples:
</p>
<div class="diagram">
<pre>
C datatype                     SWIG encoding (strings)
-----------------------------  --------------------------
int                            "int"
int *                          "p.int"
const int *                    "p.q(const).int"
int (*x)(int, double)          "p.f(int, double).int"
int [20][30]                   "a(20).a(30).int"
int (F::*)(int)                "m(F).f(int).int"
vector&lt;int&gt; *                  "p.vector&lt;(int)&gt;"
</pre>
</div>
<p>
Reading the SWIG encoding is often easier than figuring out the C code: just
read it from left to right. For example, "p.f(int, double).int" is
a "pointer to a function(int, double) that returns int".
</p>
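<p>
The left-to-right reading rule can be made concrete with a small standalone sketch that
turns an encoded type into an English phrase. It is not part of SWIG:
<tt>describe_type()</tt> and its helpers are hypothetical, the wording of the phrases is an
arbitrary choice, and only the pointer, reference, array, function, qualifier, and member
pointer operators are recognized.
</p>

```c
#include <assert.h>
#include <string.h>

/* Appends len bytes of text to out, truncating instead of overflowing. */
static void append_text(char *out, size_t outsize, const char *text, size_t len) {
  size_t used = strlen(out);
  if (len > outsize - used - 1) len = outsize - used - 1;
  memcpy(out + used, text, len);
  out[used + len] = '\0';
}

static void append_str(char *out, size_t outsize, const char *text) {
  append_text(out, outsize, text, strlen(text));
}

/* Index just past the ')' matching the '(' at s[open], or 0 if unbalanced. */
static size_t skip_parens(const char *s, size_t open) {
  int depth = 0;
  size_t i;
  for (i = open; s[i]; i++) {
    if (s[i] == '(') depth++;
    else if (s[i] == ')' && --depth == 0) return i + 1;
  }
  return 0;
}

/* Hypothetical helper: describes an encoded type, reading left to right. */
static void describe_type(const char *type, char *out, size_t outsize) {
  const char *p = type;
  assert(out != NULL && outsize > 0);
  out[0] = '\0';
  for (;;) {
    if (p[0] == 'p' && p[1] == '.') {
      append_str(out, outsize, "pointer to ");
      p += 2;
    } else if (p[0] == 'r' && p[1] == '.') {
      append_str(out, outsize, "reference to ");
      p += 2;
    } else if (p[0] != '\0' && strchr("afqm", p[0]) && p[1] == '(') {
      size_t end = skip_parens(p, 1);
      const char *arg = p + 2;      /* text between the parentheses */
      size_t arglen;
      if (end == 0 || p[end] != '.') break;  /* not an operator: base type */
      arglen = end - 3;
      switch (p[0]) {
      case 'a':
        append_str(out, outsize, "array[");
        append_text(out, outsize, arg, arglen);
        append_str(out, outsize, "] of ");
        break;
      case 'f':
        append_str(out, outsize, "function(");
        append_text(out, outsize, arg, arglen);
        append_str(out, outsize, ") returning ");
        break;
      case 'q':
        append_text(out, outsize, arg, arglen);
        append_str(out, outsize, " ");
        break;
      default: /* 'm' */
        append_str(out, outsize, "pointer to member of class ");
        append_text(out, outsize, arg, arglen);
        append_str(out, outsize, " of type ");
        break;
      }
      p += end + 1;                 /* skip the operator and its '.' */
    } else {
      break;
    }
  }
  append_str(out, outsize, p);      /* whatever remains is the base type */
}
```

<p>
Each operator is consumed from the front of the string and contributes one phrase, which
is why the encoding reads naturally from left to right while the equivalent C declaration
has to be read from the inside out.
</p>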
<p>
The following operator encodings are used in type strings:
</p>
<div class="diagram">
<pre>
Operator              Meaning
-------------------   -------------------------------
p.                    Pointer to
a(n).                 Array of dimension n
r.                    C++ reference
m(class).             Member pointer to class
f(args).              Function.
q(qlist).             Qualifiers
</pre>
</div>
<p>
In addition, type names may be parameterized by templates. This is
represented by enclosing the template parameters in <tt>&lt;(
... )&gt;</tt>. Variable length arguments are represented by the
special base type of <tt>v(...)</tt>.
</p>
<p>
If you want to experiment with type encodings, the raw type strings can
be inserted into an interface file using backticks `` wherever a type
is expected. For instance, here is
an extremely contrived example:
</p>
<div class="diagram">
<pre>
`p.a(10).p.f(int, p.f(int).int).int` foo(int, int (*x)(int));
</pre>
</div>
<p>
This corresponds to the immediately obvious C declaration:
</p>
<div class="diagram">
<pre>
int (*(*foo(int, int (*)(int)))[10])(int, int (*)(int));
</pre>
</div>
<p>
Aside from the potential use of this declaration on a C programming quiz,
it motivates the use of the special SWIG encoding of types. The SWIG
encoding is much easier to work with because types can be easily examined,
modified, and constructed using simple string operations (comparison,
substrings, concatenation, etc.). For example, in the parser, a declaration
like this
</p>
<div class="code">
<pre>
int *a[30];
</pre>
</div>
<p>
is processed in a few pieces. In this case, you have the base type
"<tt>int</tt>" and the declarator of type "<tt>a(30).p.</tt>". To
make the final type, the two parts are just joined together using
string concatenation.
</p>
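<p>
As a minimal illustration of that last step, the sketch below joins a declarator prefix
and a base type in a fixed-size buffer. The function <tt>join_type()</tt> is a hypothetical
helper written for this example. Inside SWIG the same effect is obtained with the String
operations described earlier.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: joins a declarator prefix such as "a(30).p." with
 * a base type such as "int". Returns 0 on success, -1 if out is too small.
 */
static int join_type(const char *declarator, const char *base,
                     char *out, size_t outsize) {
  int n;
  assert(out != NULL && outsize > 0);
  n = snprintf(out, outsize, "%s%s", declarator, base);
  return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

<p>
Joining <tt>"a(30).p."</tt> and <tt>"int"</tt> this way yields <tt>"a(30).p.int"</tt>,
which reads as "array of 30 pointers to int".
</p>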
<H3><a name="Extending_nn25">39.8.2 Type construction</a></H3>
<p>
The following functions are used to construct types. You should use
these functions instead of trying to build the type strings yourself.
</p>
<p>
<b><tt>void SwigType_add_pointer(SwigType *ty)</tt></b>
</p>
<div class="indent">
Adds a pointer to <tt>ty</tt>.
</div>
<p>
<b><tt>void SwigType_del_pointer(SwigType *ty)</tt></b>
</p>
<div class="indent">
Removes a single pointer from <tt>ty</tt>.
</div>
<p>
<b><tt>void SwigType_add_reference(SwigType *ty)</tt></b>
</p>
<div class="indent">
Adds a reference to <tt>ty</tt>.
</div>
<p>
<b><tt>void SwigType_add_array(SwigType *ty, const String_or_char *size)</tt></b>
</p>
<div class="indent">
Adds an array with dimension <tt>size</tt> to <tt>ty</tt>.
</div>
<p>
<b><tt>void SwigType_del_array(SwigType *ty)</tt></b>
</p>
<div class="indent">
Removes a single array dimension from <tt>ty</tt>.
</div>
<p>
<b><tt>int SwigType_array_ndim(SwigType *ty)</tt></b>
</p>
<div class="indent">
Returns number of array dimensions of <tt>ty</tt>.
</div>
<p>
<b><tt>String* SwigType_array_getdim(SwigType *ty, int n)</tt></b>
</p>
<div class="indent">
Returns <tt>n</tt>th array dimension of <tt>ty</tt>.
</div>
<p>
<b><tt>void SwigType_array_setdim(SwigType *ty, int n, const String_or_char *rep)</tt></b>
</p>
<div class="indent">
Sets the <tt>n</tt>th array dimension of <tt>ty</tt> to <tt>rep</tt>.
</div>
<p>
<b><tt>void SwigType_add_qualifier(SwigType *ty, const String_or_char *q)</tt></b>
</p>
<div class="indent">
Adds a type qualifier <tt>q</tt> to <tt>ty</tt>. <tt>q</tt> is typically
<tt>"const"</tt> or <tt>"volatile"</tt>.
</div>
<p>
<b><tt>void SwigType_add_memberpointer(SwigType *ty, const String_or_char *cls)</tt></b>
</p>
<div class="indent">
Adds a pointer to a member of class <tt>cls</tt> to <tt>ty</tt>.
</div>
<p>
<b><tt>void SwigType_add_function(SwigType *ty, ParmList *p)</tt></b>
</p>
<div class="indent">
Adds a function to <tt>ty</tt>. <tt>p</tt> is a linked-list of parameter
nodes as generated by the parser. See the section on parameter lists
for details about the representation.
</div>
<p>
<b><tt>void SwigType_add_template(SwigType *ty, ParmList *p)</tt></b>
</p>
<div class="indent">
Adds a template to <tt>ty</tt>. <tt>p</tt> is a linked-list of parameter
nodes as generated by the parser. See the section on parameter lists
for details about the representation.
</div>
<p>
<b><tt>SwigType *SwigType_pop(SwigType *ty)</tt></b>
</p>
<div class="indent">
Removes the outermost type operator (the leftmost element of the encoding)
from <tt>ty</tt> and returns it. <tt>ty</tt> is modified.
</div>
<p>
<b><tt>void SwigType_push(SwigType *ty, SwigType *op)</tt></b>
</p>
<div class="indent">
Pushes the type operators in <tt>op</tt> onto type <tt>ty</tt>. The
opposite of <tt>SwigType_pop()</tt>.
</div>
<p>
<b><tt>SwigType *SwigType_pop_arrays(SwigType *ty)</tt></b>
</p>
<div class="indent">
Removes all leading array operators from <tt>ty</tt> and returns them.
<tt>ty</tt> is modified. For example, if <tt>ty</tt> is <tt>"a(20).a(10).p.int"</tt>,
then this function would return <tt>"a(20).a(10)."</tt> and modify <tt>ty</tt>
so that it has the value <tt>"p.int"</tt>.
</div>
<p>
<b><tt>SwigType *SwigType_pop_function(SwigType *ty)</tt></b>
</p>
<div class="indent">
Removes a function operator from <tt>ty</tt> including any qualification.
<tt>ty</tt> is modified. For example, if <tt>ty</tt> is <tt>"f(int).int"</tt>,
then this function would return <tt>"f(int)."</tt> and modify <tt>ty</tt>
so that it has the value <tt>"int"</tt>.
</div>
<p>
<b><tt>SwigType *SwigType_base(SwigType *ty)</tt></b>
</p>
<div class="indent">
Returns the base type of a type. For example, if <tt>ty</tt> is
<tt>"p.a(20).int"</tt>, this function would return <tt>"int"</tt>.
<tt>ty</tt> is unmodified.
</div>
<p>
<b><tt>SwigType *SwigType_prefix(SwigType *ty)</tt></b>
</p>
<div class="indent">
Returns the prefix of a type. For example, if <tt>ty</tt> is
<tt>"p.a(20).int"</tt>, this function would return <tt>"p.a(20)."</tt>.
<tt>ty</tt> is unmodified.
</div>
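<p>
The split between prefix and base that <tt>SwigType_prefix()</tt> and
<tt>SwigType_base()</tt> are described as performing can be approximated on plain C
strings. This is a sketch under the assumption that the base type begins after the last
<tt>'.'</tt> that is not nested inside parentheses. The helpers <tt>base_offset()</tt>,
<tt>type_prefix()</tt>, and <tt>type_base()</tt> are hypothetical and ignore details such
as C++ scope operators.
</p>

```c
#include <assert.h>
#include <string.h>

/*
 * Returns the offset at which the base type starts: one past the last
 * '.' that is not nested inside parentheses, or 0 if there is none.
 */
static size_t base_offset(const char *type) {
  size_t i, start = 0;
  int depth = 0;
  assert(type != NULL);
  for (i = 0; type[i]; i++) {
    if (type[i] == '(') depth++;
    else if (type[i] == ')') depth--;
    else if (type[i] == '.' && depth == 0) start = i + 1;
  }
  return start;
}

/* Copies the operator prefix of type into out (empty for simple types). */
static void type_prefix(const char *type, char *out, size_t outsize) {
  size_t n = base_offset(type);
  assert(outsize > 0);
  if (n >= outsize) n = outsize - 1;
  memcpy(out, type, n);
  out[n] = '\0';
}

/* Returns a pointer to the base type inside type (no copy is made). */
static const char *type_base(const char *type) {
  return type + base_offset(type);
}
```

<p>
Tracking the parenthesis depth matters because function arguments and template parameters
may themselves contain encoded types with dots, as in
<tt>"p.f(int, p.f(int).int).int"</tt>.
</p>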
<H3><a name="Extending_nn26">39.8.3 Type tests</a></H3>
<p>
The following functions can be used to test properties of a datatype.
</p>
<p>
<b><tt>int SwigType_ispointer(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a standard pointer.
</div>
<p>
<b><tt>int SwigType_ismemberpointer(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a member pointer.
</div>
<p>
<b><tt>int SwigType_isreference(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a C++ reference.
</div>
<p>
<b><tt>int SwigType_isarray(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is an array.
</div>
<p>
<b><tt>int SwigType_isfunction(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a function.
</div>
<p>
<b><tt>int SwigType_isqualifier(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a qualifier.
</div>
<p>
<b><tt>int SwigType_issimple(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a simple type. No operators applied.
</div>
<p>
<b><tt>int SwigType_isconst(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a const type.
</div>
<p>
<b><tt>int SwigType_isvarargs(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a varargs type.
</div>
<p>
<b><tt>int SwigType_istemplate(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> is a templatized type.
</div>
<H3><a name="Extending_nn27">39.8.4 Typedef and inheritance</a></H3>
<p>
The behavior of a <tt>typedef</tt> declaration is to introduce a type alias.
For instance, <tt>typedef int Integer</tt> makes the identifier
<tt>Integer</tt> an alias for <tt>int</tt>. The treatment of typedef in
SWIG is somewhat complicated due to the pattern matching rules that get applied
in typemaps and the fact that SWIG prefers to generate wrapper code
that closely matches the input to simplify debugging (a user will see the
typedef names used in their program instead of the low-level primitive C
datatypes).
</p>
<p>
To handle <tt>typedef</tt>, SWIG builds a collection of trees containing typedef relations. For example,
</p>
<div class="code">
<pre>
typedef int Integer;
typedef Integer *IntegerPtr;
typedef int Number;
typedef int Size;
</pre>
</div>
<p>
produces two trees like this:
</p>
<div class="diagram">
<pre>
                 int               p.Integer
               ^  ^  ^                 ^
              /   |   \                |
             /    |    \               |
        Integer  Size   Number    IntegerPtr
</pre>
</div>
<p>
To resolve a single typedef relationship, the following function is used:
</p>
<p>
<b><tt>SwigType *SwigType_typedef_resolve(SwigType *ty)</tt></b>
</p>
<div class="indent">
Checks if <tt>ty</tt> can be reduced to a new type via typedef. If so,
returns the new type. If not, returns NULL.
</div>
<p>
Typedefs are only resolved in simple typenames that appear in a type,
such as the base type name and the typenames used in function parameters. When
resolving types, the process starts in the leaf nodes and moves up
the tree towards the root. Here are a few examples that show how it works:
</p>
<div class="diagram">
<pre>
Original type            After typedef_resolve()
------------------------ -----------------------
Integer                  int
a(30).Integer            a(30).int
p.IntegerPtr             p.p.Integer
p.p.Integer              p.p.int
</pre>
</div>
<p>
For complicated types, the process can be quite involved. Here is the
reduction of a function pointer:
</p>
<div class="diagram">
<pre>
p.f(Integer, p.IntegerPtr, Size).Integer : Start
p.f(Integer, p.IntegerPtr, Size).int
p.f(int, p.IntegerPtr, Size).int
p.f(int, p.p.Integer, Size).int
p.f(int, p.p.int, Size).int
p.f(int, p.p.int, int).int : End
</pre>
</div>
<p>
Two types are equivalent if their full type reductions are the same.
The following function will fully reduce a datatype:
</p>
<p>
<b><tt>SwigType *SwigType_typedef_resolve_all(SwigType *ty)</tt></b>
</p>
<div class="indent">
Fully reduces <tt>ty</tt> according to typedef rules. Resulting datatype
will consist only of primitive typenames.
</div>
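<p>
The step-by-step reduction can be sketched in standalone C with a fixed typedef table.
This is a deliberately simplified model: it only resolves the base name of a type (not
typenames inside function parameters), and the names <tt>typedef_table</tt>,
<tt>resolve_once()</tt>, and <tt>resolve_all()</tt> are hypothetical, not SWIG functions.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical typedef table: alias name and the type it stands for. */
typedef struct {
  const char *name;
  const char *target;
} TypedefEntry;

static const TypedefEntry typedef_table[] = {
  {"Integer", "int"},
  {"IntegerPtr", "p.Integer"},
  {"Number", "int"},
  {"Size", "int"},
};

/* Offset of the base type: one past the last '.' outside parentheses. */
static size_t base_offset(const char *type) {
  size_t i, start = 0;
  int depth = 0;
  for (i = 0; type[i]; i++) {
    if (type[i] == '(') depth++;
    else if (type[i] == ')') depth--;
    else if (type[i] == '.' && depth == 0) start = i + 1;
  }
  return start;
}

/*
 * Performs one reduction step on the base name only. Writes the reduced
 * type to out and returns 1, or returns 0 if no typedef applies.
 */
static int resolve_once(const char *type, char *out, size_t outsize) {
  size_t i, n = base_offset(type);
  for (i = 0; i < sizeof typedef_table / sizeof typedef_table[0]; i++) {
    if (strcmp(type + n, typedef_table[i].name) == 0) {
      /* Keep the operator prefix and substitute the typedef target. */
      snprintf(out, outsize, "%.*s%s", (int)n, type, typedef_table[i].target);
      return 1;
    }
  }
  return 0;
}

/* Repeats resolve_once() until the base name is no longer a typedef. */
static void resolve_all(const char *type, char *out, size_t outsize) {
  char tmp[128];
  assert(outsize > 0);
  snprintf(out, outsize, "%s", type);
  while (resolve_once(out, tmp, sizeof tmp)) {
    snprintf(out, outsize, "%s", tmp);
  }
}
```

<p>
With this table, <tt>resolve_all()</tt> reduces <tt>"p.IntegerPtr"</tt> first to
<tt>"p.p.Integer"</tt> and then to <tt>"p.p.int"</tt>, at which point no further typedef
applies.
</p>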
<H3><a name="Extending_nn28">39.8.5 Lvalues</a></H3>
<p>
When generating wrapper code, it is necessary to emit datatypes that can
be used on the left-hand side of an assignment operator (an lvalue). However,
not all C datatypes can be used in this way, especially arrays and
const-qualified types. To generate a type that can be used as an lvalue,
use the following function:
</p>
<p>
<b><tt>SwigType *SwigType_ltype(SwigType *ty)</tt></b>
</p>
<div class="indent">
Converts type <tt>ty</tt> to a type that can be used as an lvalue in
assignment. The resulting type is stripped of qualifiers and arrays are
converted to pointers.
</div>
<p>
The creation of lvalues is fully aware of typedef and other aspects
of the type system. Therefore, the creation of an lvalue may produce
unexpected results. Here are a few examples:
</p>
<div class="code">
<pre>
typedef double Matrix4[4][4];
Matrix4 x; // type = 'Matrix4', ltype='p.a(4).double'
typedef const char * Literal;
Literal y; // type = 'Literal', ltype='p.char'
</pre>
</div>
<H3><a name="Extending_nn29">39.8.6 Output functions</a></H3>
<p>
The following functions produce strings that are suitable for output.
</p>
<p>
<b><tt>String *SwigType_str(SwigType *ty, const String_or_char *id = 0)</tt></b>
</p>
<div class="indent">
Generates a C string for a datatype. <tt>id</tt> is an optional declarator.
For example, if <tt>ty</tt> is "p.f(int).int" and <tt>id</tt> is "foo", then
this function produces "<tt>int (*foo)(int)</tt>". This function is
used to convert string-encoded types back into a form that is valid C syntax.
</div>
<p>
<b><tt>String *SwigType_lstr(SwigType *ty, const String_or_char *id = 0)</tt></b>
</p>
<div class="indent">
This is the same as <tt>SwigType_str()</tt> except that the result
is generated from the type's lvalue (as generated from <tt>SwigType_ltype()</tt>).
</div>
<p>
<b><tt>String *SwigType_lcaststr(SwigType *ty, const String_or_char *id = 0)</tt></b>
</p>
<div class="indent">
Generates a casting operation that converts from type <tt>ty</tt> to its
lvalue. <tt>id</tt> is an optional name to include in the cast. For example,
if <tt>ty</tt> is "<tt>q(const).p.char</tt>" and <tt>id</tt> is "<tt>foo</tt>",
this function produces the string "<tt>(char *) foo</tt>".
</div>
<p>
<b><tt>String *SwigType_rcaststr(SwigType *ty, const String_or_char *id = 0)</tt></b>
</p>
<div class="indent">
Generates a casting operation that converts from a type's lvalue to a
type equivalent to <tt>ty</tt>. <tt>id</tt> is an optional name to
include in the cast. For example, if <tt>ty</tt> is
"<tt>q(const).p.char</tt>" and <tt>id</tt> is "<tt>foo</tt>", this
function produces the string "<tt>(const char *) foo</tt>".
</div>
<p>
<b><tt>String *SwigType_manglestr(SwigType *ty)</tt></b>
</p>
<div class="indent">
Generates a mangled string encoding of type <tt>ty</tt>. The
mangled string only contains characters that are part of a valid
C identifier. The resulting string is used in various parts of
SWIG, but is most commonly associated with type-descriptor objects
that appear in wrappers (e.g., <tt>SWIGTYPE_p_double</tt>).
</div>
<H2><a name="Extending_nn30">39.9 Parameters</a></H2>
<p>
Several type-related functions involve parameter lists. These include
functions and templates. Parameter lists are represented as a list of
nodes with the following attributes:
</p>
<div class="diagram">
<pre>
"type" - Parameter type (required)
"name" - Parameter name (optional)
"value" - Initializer (optional)
</pre>
</div>
<p>
Typically parameters are denoted in the source by using a typename of
<tt>Parm *</tt> or <tt>ParmList *</tt>. To walk a parameter list, simply use
code like this:
</p>
<div class="diagram">
<pre>
Parm *parms;
Parm *p;
for (p = parms; p; p = nextSibling(p)) {
SwigType *type = Getattr(p, "type");
String *name = Getattr(p, "name");
String *value = Getattr(p, "value");
...
}
</pre>
</div>
<p>
Note: this code is exactly the same as what you would use to walk parse tree nodes.
</p>
<p>
An empty list of parameters is denoted by a NULL pointer.
</p>
<p>
Since parameter lists are fairly common, the following utility functions are provided
to manipulate them:
</p>
<p>
<b><tt>Parm *CopyParm(Parm *p);</tt></b>
</p>
<div class="indent">
Copies a single parameter.
</div>
<p>
<b><tt>ParmList *CopyParmList(ParmList *p);</tt></b>
</p>
<div class="indent">
Copies an entire list of parameters.
</div>
<p>
<b><tt>int ParmList_len(ParmList *p);</tt></b>
</p>
<div class="indent">
Returns the number of parameters in a parameter list.
</div>
<p>
<b><tt>String *ParmList_str(ParmList *p);</tt></b>
</p>
<div class="indent">
Converts a parameter list into a C string. For example, it
produces a string like "<tt>(int *p, int n, double x);</tt>".
</div>
<p>
<b><tt>String *ParmList_protostr(ParmList *p);</tt></b>
</p>
<div class="indent">
The same as <tt>ParmList_str()</tt> except that parameter names are not
included. Used to emit prototypes.
</div>
<p>
<b><tt>int ParmList_numrequired(ParmList *p);</tt></b>
</p>
<div class="indent">
Returns the number of required (non-optional) arguments in <tt>p</tt>.
</div>
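<p>
To make the behaviour of these utilities concrete, here is a minimal standalone sketch.
It does not use SWIG's own <tt>Parm</tt> type or DOH strings: <tt>ParamNode</tt>,
<tt>param_count()</tt>, <tt>required_count()</tt> and <tt>proto_string()</tt> are hypothetical
names, types are written as plain C text rather than SWIG's encoded form, and a parameter
is assumed to be optional exactly when it has a "value" initializer.
</p>

```cpp
#include <string>

// Hypothetical stand-in for a SWIG parameter node; this is NOT the real Parm type.
struct ParamNode {
  std::string type;   // "type" attribute (required), written here as plain C text
  std::string name;   // "name" attribute (optional, may be empty)
  std::string value;  // "value" attribute (optional initializer, may be empty)
  ParamNode *next;    // next sibling; a null list head means "no parameters"
};

// Count every parameter by following the sibling links.
int param_count(const ParamNode *p) {
  int n = 0;
  for (; p; p = p->next) n++;
  return n;
}

// Count the leading parameters that have no initializer.
// Assumption: a parameter is optional exactly when "value" is set.
int required_count(const ParamNode *p) {
  int n = 0;
  for (; p && p->value.empty(); p = p->next) n++;
  return n;
}

// Build a prototype-style string: types only, names omitted.
std::string proto_string(const ParamNode *p) {
  std::string out = "(";
  for (; p; p = p->next) {
    out += p->type;
    if (p->next) out += ", ";
  }
  return out + ")";
}
```

<p>
An empty list is simply a null pointer, so each helper accepts one without a special case.
</p>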
<H2><a name="Extending_nn31">39.10 Writing a Language Module</a></H2>
<p>
One of the easiest routes to supporting a new language module is to copy an already
supported language module implementation and modify it.
Be sure to choose a language that is similar in nature to the new language.
All language modules follow a similar structure and
this section briefly outlines the steps needed to create a bare-bones
language module from scratch.
Since the code is relatively easy to read, this section
describes the creation of a minimal Python module. You should be able to extrapolate
this to other languages.
</p>
<H3><a name="Extending_nn32">39.10.1 Execution model</a></H3>
<p>
Code generation modules are defined by inheriting from the <tt>Language</tt> class,
currently defined in the <tt>Source/Modules</tt> directory of SWIG. Starting from
the parsing of command line options, all aspects of code generation are controlled by
different methods of the <tt>Language</tt> class that must be defined by your module.
</p>
<H3><a name="Extending_starting_out">39.10.2 Starting out</a></H3>
<p>
To define a new language module, first create a minimal implementation using
this example as a guide:
</p>
<div class="code">
<pre>
#include "swigmod.h"
class PYTHON : public Language {
public:
virtual void main(int argc, char *argv[]) {
printf("I'm the Python module.\n");
}
virtual int top(Node *n) {
printf("Generating code.\n");
return SWIG_OK;
}
};
extern "C" Language *
swig_python(void) {
return new PYTHON();
}
</pre>
</div>
<p>
The "swigmod.h" header file contains, among other things, the declaration
of the <tt>Language</tt> base class and so you should include it at the top
of your language module's source file. Similarly, the "swigconfig.h" header
file contains some other useful definitions that you may need. Note that you
should <em>not</em> include any header files that are installed with the
target language. That is to say, the implementation of the SWIG Python module
shouldn't have any dependencies on the Python header files. The wrapper code
generated by SWIG will almost always depend on some language-specific C/C++
header files, but SWIG itself does not.
</p>
<p>
Give your language class a reasonable name, usually the same as the target language.
By convention, these class names are all uppercase (e.g. "PYTHON" for the Python
language module) but this is not a requirement. This class will ultimately consist
of a number of overrides of the virtual functions declared in the <tt>Language</tt>
base class, in addition to any language-specific member functions and data you
need. For now, just use the dummy implementations shown above.
</p>
<p>
The language module ends with a factory function, <tt>swig_python()</tt>, that simply
returns a new instance of the language class. As shown, it should be declared with the
<tt>extern "C"</tt> linkage specification so that it can be called from C code. It should
also return a pointer to the base class (<tt>Language</tt>) so that only the interface
(and not the implementation) of your language module is exposed to the rest of SWIG.
</p>
<p>
Save the code for your language module in a file named "<tt>python.cxx</tt>" and
place this file in the <tt>Source/Modules</tt> directory of the SWIG distribution.
To ensure that your module is compiled into SWIG along with the other language modules,
modify the file <tt>Source/Makefile.am</tt> to include the additional source
files. In addition, modify the file <tt>Source/Modules/swigmain.cxx</tt>
with an additional command line option that activates the module. Read the source; it is straightforward.
</p>
<p>
Next, at the top level of the SWIG distribution, re-run the <tt>autogen.sh</tt> script
to regenerate the various build files:
</p>
<div class="shell">
<pre>
$ <b>./autogen.sh</b>
</pre>
</div>
<p>
Next re-run <tt>configure</tt> to regenerate all of the Makefiles:
</p>
<div class="shell">
<pre>
$ <b>./configure</b>
</pre>
</div>
<p>
Finally, rebuild SWIG with your module added:
</p>
<div class="shell">
<pre>
$ <b>make</b>
</pre>
</div>
<p>
Once it finishes compiling, try running SWIG with the command-line option
that activates your module. For example, <tt>swig -python foo.i</tt>. The
messages from your new module should appear.
</p>
<H3><a name="Extending_nn34">39.10.3 Command line options</a></H3>
<p>
When SWIG starts, the command line options are passed to your language module. This occurs
before any other processing occurs (preprocessing, parsing, etc.). To capture the
command line options, simply use code similar to this:
</p>
<div class="code">
<pre>
void PYTHON::main(int argc, char *argv[]) {
for (int i = 1; i &lt; argc; i++) {
if (argv[i]) {
if (strcmp(argv[i], "-interface") == 0) {
if (argv[i+1]) {
interface = NewString(argv[i+1]);
Swig_mark_arg(i);
Swig_mark_arg(i+1);
i++;
} else {
Swig_arg_error();
}
} else if (strcmp(argv[i], "-globals") == 0) {
if (argv[i+1]) {
global_name = NewString(argv[i+1]);
Swig_mark_arg(i);
Swig_mark_arg(i+1);
i++;
} else {
Swig_arg_error();
}
} else if ((strcmp(argv[i], "-proxy") == 0)) {
proxy_flag = 1;
Swig_mark_arg(i);
} else if (strcmp(argv[i], "-keyword") == 0) {
use_kw = 1;
Swig_mark_arg(i);
} else if (strcmp(argv[i], "-help") == 0) {
fputs(usage, stderr);
}
...
}
}
}
</pre>
</div>
<p>
The exact set of options depends on what you want to do in your module. Generally,
you would use the options to change code generation modes or to print diagnostic information.
</p>
<p>
If a module recognizes an option, it should always call <tt>Swig_mark_arg()</tt>
to mark the option as valid. If you forget to do this, SWIG will terminate with an
unrecognized command line option error.
</p>
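<p>
The marking protocol can be illustrated with a small standalone sketch. It does not call
into SWIG: <tt>ArgTracker</tt>, <tt>ModuleOptions</tt> and <tt>parse_module_options()</tt> are
hypothetical names, and the way the core actually records marked arguments may differ.
The point is only that the module claims the options it understands, and whatever remains
unclaimed can then be reported as unrecognized.
</p>

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the core's bookkeeping of claimed arguments.
struct ArgTracker {
  std::vector<bool> marked;
  explicit ArgTracker(int argc) : marked(static_cast<std::size_t>(argc), false) {
    if (argc > 0) marked[0] = true;  // the program name is never an option
  }
  void mark(int i) { marked[static_cast<std::size_t>(i)] = true; }
  // Index of the first argument nobody claimed, or -1 if all were claimed.
  int first_unmarked() const {
    for (std::size_t i = 0; i < marked.size(); i++)
      if (!marked[i]) return static_cast<int>(i);
    return -1;
  }
};

// Options understood by this illustrative module.
struct ModuleOptions {
  std::string interface_name;
  bool proxy;
};

// Claim "-interface <name>" and "-proxy"; leave everything else unmarked.
ModuleOptions parse_module_options(int argc, const char *const argv[], ArgTracker &t) {
  ModuleOptions o{"", false};
  for (int i = 1; i < argc; i++) {
    if (std::strcmp(argv[i], "-interface") == 0 && i + 1 < argc) {
      o.interface_name = argv[i + 1];
      t.mark(i);
      t.mark(i + 1);  // the option's value must be claimed too
      i++;
    } else if (std::strcmp(argv[i], "-proxy") == 0) {
      o.proxy = true;
      t.mark(i);
    }
  }
  return o;
}
```

<p>
Note that an option taking a value marks two slots. Forgetting the second mark would
leave the value looking like an unrecognized argument.
</p>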
<H3><a name="Extending_nn35">39.10.4 Configuration and preprocessing</a></H3>
<p>
In addition to looking at command line options, the <tt>main()</tt> method is responsible
for some initial configuration of the SWIG library and preprocessor. To do this,
insert some code like this:
</p>
<div class="code">
<pre>
void main(int argc, char *argv[]) {
... command line options ...
/* Set language-specific subdirectory in SWIG library */
SWIG_library_directory("python");
/* Set language-specific preprocessing symbol */
Preprocessor_define("SWIGPYTHON 1", 0);
/* Set language-specific configuration file */
SWIG_config_file("python.swg");
/* Set typemap language (historical) */
SWIG_typemap_lang("python");
}
</pre>
</div>
<p>
The above code does several things: it registers the name of the
language module with the core, it supplies some preprocessor macro definitions
for use in input files (so that they can determine the target language), and
it registers a start-up file. In this case, the file <tt>python.swg</tt> will
be parsed before any part of the user-supplied input file.
</p>
<p>
Before proceeding any further, create a directory for your module in the SWIG
library (the <tt>Lib</tt> directory). Then create a configuration file in that
directory, for example <tt>python.swg</tt>.
</p>
<p>
Just to review, your language module should now consist of two files:
an implementation file <tt>python.cxx</tt> and a configuration file
<tt>python.swg</tt>.
</p>
<H3><a name="Extending_nn36">39.10.5 Entry point to code generation</a></H3>
<p>
SWIG is a multi-pass compiler. Once the <tt>main()</tt> method has
been invoked, the language module does not execute again until
preprocessing, parsing, and a variety of semantic analysis passes have
been performed. When the core is ready to start generating wrappers,
it invokes the <tt>top()</tt> method of your language class. The
argument to <tt>top</tt> is a single parse tree node that corresponds to
the top of the entire parse tree.
</p>
<p>
To get the code generation process started, the <tt>top()</tt> procedure needs
to do several things:
</p>
<ul>
<li>Initialize the wrapper code output.
<li>Set the module name.
<li>Emit common initialization code.
<li>Emit code for all of the child nodes.
<li>Finalize the wrapper module and cleanup.
</ul>
<p>
An outline of <tt>top()</tt> might be as follows:
</p>
<div class="code">
<pre>
int PYTHON::top(Node *n) {
/* Get the module name */
String *module = Getattr(n, "name");
/* Get the output file name */
String *outfile = Getattr(n, "outfile");
/* Initialize I/O (see next section) */
...
/* Output module initialization code */
...
/* Emit code for children */
Language::top(n);
...
/* Cleanup files */
...
return SWIG_OK;
}
</pre>
</div>
<H3><a name="Extending_nn37">39.10.6 Module I/O and wrapper skeleton</a></H3>
<!-- please report bugs in this section to mgossage -->
<p>
Within SWIG wrappers, there are five main sections. These are (in order):
</p>
<ul>
<li>begin: This section is a placeholder for users to put code at the beginning of the C/C++ wrapper file.
<li>runtime: This section has most of the common SWIG runtime code.
<li>header: This section holds declarations and inclusions from the .i file.
<li>wrapper: This section holds all the wrapper code.
<li>init: This section holds the module initialisation function
(the entry point for the interpreter).
</ul>
<p>
Different parts of the SWIG code fill different sections;
once wrapping is complete, all the sections are written
to the wrapper file.
</p>
<p>
Doing this requires several additions to the code in various places,
such as:
</p>
<div class="code">
<pre>
class PYTHON : public Language {
protected:
/* General DOH objects used for holding the strings */
File *f_begin;
File *f_runtime;
File *f_header;
File *f_wrappers;
File *f_init;
public:
...
};
int PYTHON::top(Node *n) {
...
/* Initialize I/O */
f_begin = NewFile(outfile, "w", SWIG_output_files());
if (!f_begin) {
FileErrorDisplay(outfile);
SWIG_exit(EXIT_FAILURE);
}
f_runtime = NewString("");
f_init = NewString("");
f_header = NewString("");
f_wrappers = NewString("");
/* Register file targets with the SWIG file handler */
Swig_register_filebyname("begin", f_begin);
Swig_register_filebyname("header", f_header);
Swig_register_filebyname("wrapper", f_wrappers);
Swig_register_filebyname("runtime", f_runtime);
Swig_register_filebyname("init", f_init);
/* Output module initialization code */
Swig_banner(f_begin);
...
/* Emit code for children */
Language::top(n);
...
/* Write all to the file */
Dump(f_runtime, f_begin);
Dump(f_header, f_begin);
Dump(f_wrappers, f_begin);
Wrapper_pretty_print(f_init, f_begin);
/* Cleanup files */
Delete(f_runtime);
Delete(f_header);
Delete(f_wrappers);
Delete(f_init);
Delete(f_begin);
return SWIG_OK;
}
</pre>
</div>
<p>
Processing a file with this code generates a wrapper file, but the
wrapper consists only of the common SWIG code and any inline
code written in the .i file. It does not contain wrappers for
any of the functions or classes.
</p>
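<p>
The buffering scheme used above can be reduced to a small standalone sketch. It uses
<tt>std::string</tt> in place of DOH <tt>File</tt> and <tt>String</tt> objects, and
<tt>WrapperSections</tt> and <tt>assemble()</tt> are hypothetical names. It shows only that
sections may be filled in any order while the final file always lays them out as
begin, runtime, header, wrapper, init.
</p>

```cpp
#include <string>

// Hypothetical stand-in for the five output sections of a wrapper file.
struct WrapperSections {
  std::string begin;     // user code placed at the very top, plus the banner
  std::string runtime;   // common SWIG runtime code
  std::string header;    // declarations and inclusions from the .i file
  std::string wrappers;  // the wrapper functions themselves
  std::string init;      // module initialisation function

  // Concatenate the sections in the fixed order of the wrapper file,
  // regardless of the order in which they were filled.
  std::string assemble() const {
    return begin + runtime + header + wrappers + init;
  }
};
```

<p>
Keeping each section in its own buffer is what allows, for example, an initialisation
entry to be appended while a wrapper function is still being generated.
</p>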
<p>
The wrappers themselves are generated by the various member functions, which
have not been touched so far. We will look at <tt>functionWrapper()</tt> as this
is the most commonly used function. In fact, many of the other wrapper routines
call it to do their work.
</p>
<p>
A simple modification to write some basic details to the wrapper looks like this:
</p>
<div class="code">
<pre>
int PYTHON::functionWrapper(Node *n) {
/* Get some useful attributes of this function */
String *name = Getattr(n, "sym:name");
SwigType *type = Getattr(n, "type");
ParmList *parms = Getattr(n, "parms");
String *parmstr= ParmList_str_defaultargs(parms); // to string
String *func = SwigType_str(type, NewStringf("%s(%s)", name, parmstr));
String *action = Getattr(n, "wrap:action");
Printf(f_wrappers, "functionWrapper : %s\n", func);
Printf(f_wrappers, " action : %s\n", action);
return SWIG_OK;
}
</pre>
</div>
<p>
This will now produce some useful information within your wrapper file.
</p>
<div class="shell">
<pre>
functionWrapper : void delete_Shape(Shape *self)
action : delete arg1;
functionWrapper : void Shape_x_set(Shape *self, double x)
action : if (arg1) (arg1)-&gt;x = arg2;
functionWrapper : double Shape_x_get(Shape *self)
action : result = (double) ((arg1)-&gt;x);
functionWrapper : void Shape_y_set(Shape *self, double y)
action : if (arg1) (arg1)-&gt;y = arg2;
...
</pre>
</div>
<H3><a name="Extending_nn38">39.10.7 Low-level code generators</a></H3>
<!-- please report bugs in this section to mgossage -->
<p>
As ingenious as SWIG is, and despite all its capabilities and the power of
its parser, the low-level code generation takes a lot of work to write
properly, mainly because every language insists on its own manner of
interfacing to C/C++. To write the code generators you will need a good
understanding of how to manually write an interface to your chosen
language, so make sure you have your documentation handy.
</p>
<p>
At this point it is also probably a good idea to take a very simple file
(just one function), and try letting SWIG generate wrappers for many
different languages. Take a look at all of the wrappers generated, and decide
which one looks closest to the language you are trying to wrap.
This may help you to decide which code to look at.
</p>
<p>
In general most language wrappers look a little like this:
</p>
<div class="code">
<pre>
/* wrapper for TYPE3 some_function(TYPE1, TYPE2); */
RETURN_TYPE _wrap_some_function(ARGS){
TYPE1 arg1;
TYPE2 arg2;
TYPE3 result;
if(ARG1 is not of TYPE1) goto fail;
arg1=(convert ARG1);
if(ARG2 is not of TYPE2) goto fail;
arg2=(convert ARG2);
result=some_function(arg1, arg2);
convert 'result' to whatever the language wants;
do any tidy up;
return ALL_OK;
fail:
do any tidy up;
return ERROR;
}
</pre>
</div>
<p>
Yes, it is rather vague and not very clear. But each language works differently
so this will have to do for now.
</p>
<p>
Tackling this problem will be done in two stages:
</p>
<ul>
<li>The skeleton: the function wrapper and the call, but without the conversion
<li>The conversion: converting the arguments to and from what the language wants
</ul>
<p>
The first step will be done in the code, the second will be done in typemaps.
</p>
<p>
Our first step will be to write the code for <tt>functionWrapper()</tt>. What is
shown below is <b>NOT</b> the solution, merely a step in the right direction.
There are a lot of issues to address.
</p>
<ul>
<li>Variable length and default parameters
<li>Typechecking and number of argument checks
<li>Overloaded functions
<li>Input/output (inout) and output-only arguments
</ul>
<div class="code">
<pre>
virtual int functionWrapper(Node *n) {
/* get useful attributes */
String *name = Getattr(n, "sym:name");
SwigType *type = Getattr(n, "type");
ParmList *parms = Getattr(n, "parms");
...
/* create the wrapper object */
Wrapper *wrapper = NewWrapper();
/* create the function's wrapper name */
String *wname = Swig_name_wrapper(name);
/* deal with overloading */
....
/* write the wrapper function definition */
Printv(wrapper-&gt;def, "RETURN_TYPE ", wname, "(ARGS) {", NIL);
/* if any additional local variable needed, add them now */
...
/* write the list of locals/arguments required */
emit_args(type, parms, wrapper);
/* check arguments */
...
/* write typemaps(in) */
....
/* write constraints */
....
/* Emit the function call */
emit_action(n, wrapper);
/* return value if necessary */
....
/* write typemaps(out) */
....
/* add cleanup code */
....
/* Close the function(ok) */
Printv(wrapper-&gt;code, "return ALL_OK;\n", NIL);
/* add the failure cleanup code */
...
/* Close the function(error) */
Printv(wrapper-&gt;code, "return ERROR;\n", "}\n", NIL);
/* final substitutions if applicable */
...
/* Dump the function out */
Wrapper_print(wrapper, f_wrappers);
/* tidy up */
Delete(wname);
DelWrapper(wrapper);
return SWIG_OK;
}
</pre>
</div>
<p>
Executing this code will produce wrappers that have our basic skeleton
but no typemaps, so there is still work to do.
</p>
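<p>
As a rough illustration of how the skeleton is assembled, the following standalone sketch
keeps the definition line, the local declarations and the body in separate buffers and joins
them only at the end. <tt>MiniWrapper</tt> and <tt>make_skeleton()</tt> are hypothetical
names, not SWIG's <tt>Wrapper</tt> API, and the placeholder tokens (<tt>RETURN_TYPE</tt>,
<tt>ARGS</tt>, <tt>ALL_OK</tt>, <tt>ERROR</tt>) are the same ones used in the outline above.
</p>

```cpp
#include <string>

// Hypothetical stand-in for a wrapper under construction.
struct MiniWrapper {
  std::string def;     // function signature and opening brace
  std::string locals;  // local variable declarations
  std::string code;    // body, including the closing brace

  // Emit the pieces in source order: signature, locals, then body.
  std::string print() const { return def + "\n" + locals + code; }
};

// Build the bare skeleton: call the function, then the ok and fail exits.
MiniWrapper make_skeleton(const std::string &wname) {
  MiniWrapper w;
  w.def = "RETURN_TYPE " + wname + "(ARGS) {";
  w.locals = "  TYPE3 result;\n";
  w.code += "  result = some_function();\n";
  w.code += "  return ALL_OK;\n";
  w.code += "fail:\n";
  w.code += "  return ERROR;\n";
  w.code += "}\n";
  return w;
}
```

<p>
Separating locals from the body means a later step can still declare a variable
after body code has already been written.
</p>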
<H3><a name="Extending_configuration_files">39.10.8 Configuration files</a></H3>
<!-- please report bugs in this section to ttn -->
<p>
At the time of this writing, SWIG supports nearly twenty languages,
which means that for continued sanity in maintaining the configuration
files, the language modules need to follow some conventions. These are
outlined here along with the admission that, yes it is ok to violate
these conventions in minor ways, as long as you know where to apply the
proper kludge to keep the overall system regular and running.
Engineering is the art of compromise, see...
</p>
<p>
Much of the maintenance regularity depends on choosing a suitable
nickname for your language module (and then using it in a controlled
way). Nicknames should be all lower case letters with an optional
numeric suffix (no underscores, no dashes, no spaces). Some examples
are: <TT>foo</TT>, <TT>bar</TT>, <TT>qux99</TT>.
</p>
<p>
The numeric suffix variant, as in the last example, is somewhat tricky
to work with because sometimes people expect to refer to the language
without this number but sometimes that number is extremely relevant
(especially when it corresponds to language implementation versions with
incompatible interfaces). New language modules that unavoidably require
a numeric suffix in their nickname should include that number in all
uses, or be prepared to kludge.
</p>
<p>
The nickname is used in three places:
</p>
<TABLE summary="nickname table">
<TR><TD><B>usage</B></TD><TD><B>transform</B></TD></TR>
<TR><TD>"skip" tag</TD><TD>(none)</TD></TR>
<TR><TD>Examples/ subdir name</TD><TD>(none)</TD></TR>
<TR><TD>Examples/test-suite/ subdir name</TD><TD>(none)</TD></TR>
<!-- add more uses here (remember to adjust header) -->
</TABLE>
<p>
As you can see, these usages are direct.
</p>
<dl>
<dt> <b>configure.ac</b>
<dd>
<p>
This file is processed by
<A HREF="http://www.gnu.org/software/autoconf/">autoconf</A>
to generate the <TT>configure</TT> script. This is where you
need to add shell script fragments and autoconf macros to detect the
presence of whatever development support your language module requires,
typically directories where headers and libraries can be found, and/or
utility programs useful for integrating the generated wrapper code.
</p>
<p>
Use the <TT>AC_ARG_WITH</TT>, <TT>AC_MSG_CHECKING</TT>, <TT>AC_SUBST</TT>
macros and so forth (see other languages for examples). Avoid using the
<TT>[</TT> and <TT>]</TT> characters in shell script fragments. The
variable names passed to <TT>AC_SUBST</TT> should begin with the nickname,
entirely upcased.
</p>
<p>
At the end of the new section is the place to put the aforementioned
nickname kludges (should they be needed). See Perl5 for
examples of what to do. [If this is still unclear after you've read
the code, ping me and I'll expand on this further. --ttn]
</p>
<dt> <b>Makefile.in</b>
<dd>
<p>
Some of the variables AC_SUBSTituted are essential to the
support of your language module. Fashion these into a shell script
"test" clause and assign that to a skip tag using "-z" and "-o":
</p>
<div class="code"><tt>
skip-qux99 = [ -z "@QUX99INCLUDE@" -o -z "@QUX99LIBS@" ]
</tt></div>
<p>
This means that if those variables are ever empty, qux99 support should
be considered absent, so it is a good idea to skip actions that
might rely on it.
</p>
<p>
Here is where you may also define an alias (but then you'll need to
kludge, so don't do this):
</p>
<div class="code"><tt>
skip-qux = $(skip-qux99)
</tt></div>
<p>
Lastly, you need to modify each of <TT>check-aliveness</TT>,
<TT>check-examples</TT>, <TT>check-test-suite</TT>
and <TT>lib-languages</TT> (var).
Use the nickname for these, not the alias.
Note that you can do this even before you have any tests or examples
set up; the Makefile rules do some sanity checking and skip around
these kinds of problems.
</p>
<dt> <b>Examples/Makefile.in</b>
<dd> Nothing special here; see comments at the top of this file
and look to the existing languages for examples.
<dt> <b>Examples/qux99/check.list</b>
<dd> Do <TT>cp ../python/check.list .</TT> and modify to taste.
One subdir per line.
<dt> <b>Lib/qux99/extra-install.list</b>
<dd> If you add your language to the top-level Makefile.in var
<TT>lib-languages</TT>, then <TT>make install</TT> will install
all <TT>*.i</TT> and <TT>*.swg</TT> files from the language-specific
subdirectory of <TT>Lib</TT>. Use (optional) file
<TT>extra-install.list</TT> in that directory to name
additional files to install (see ruby for example).
<dt> <b>Source/Modules/Makefile.am</b>
<dd> Add appropriate files to this Automake file. That's it!
<p>
When you have modified these files, please make sure that the new language module is completely
ignored if it is not installed and detected on the machine, that is, <tt>make check-examples</tt> and <tt>make check-test-suite</tt>
politely display the message that the language is being ignored.
</p>
</dl>
<H3><a name="Extending_nn40">39.10.9 Runtime support</a></H3>
<p>
This section is not yet written. It is intended to discuss the kinds of functions
typically needed for SWIG runtime support (e.g.
<tt>SWIG_ConvertPtr()</tt> and <tt>SWIG_NewPointerObj()</tt>) and the names of
the SWIG files that implement those functions.
</p>
<H3><a name="Extending_nn41">39.10.10 Standard library files</a></H3>
<p>
The set of standard library files that most languages supply keeps growing as SWIG matures.
The following are the minimum that are usually supported:
</p>
<ul>
<li> typemaps.i </li>
<li> std_string.i </li>
<li> std_vector.i </li>
<li> stl.i </li>
</ul>
<p>
Copy these files and modify them for any new language.
</p>
<H3><a name="Extending_nn42">39.10.11 User examples</a></H3>
<p>
Each of the language modules provides one or more examples. These examples
are used to demonstrate different features of the language module to SWIG
end-users, but you'll find that they're useful during development and testing
of your language module as well. You can use examples from the existing SWIG
language modules for inspiration.
</p>
<p>
Each example is self-contained and consists of (at least) a <tt>Makefile</tt>,
a SWIG interface file for the example module, and a 'runme' script that demonstrates
the functionality for that module. All of these files are stored in the same
subdirectory under the <tt>Examples/[lang]</tt> directory.
There are two classic examples which should be the first to convert to a new
language module. These are the "simple" C example and the "class" C++ example.
These can be found, for example for Python, in
<tt>Examples/python/simple</tt> and <tt>Examples/python/class</tt>.
</p>
<p>
By default, all of the examples are built and run when the user types
<tt>make check</tt>. To ensure that your examples are automatically run
during this process, see the section on <a href="#Extending_configuration_files">configuration
files</a>.
</p>
<H3><a name="Extending_test_suite">39.10.12 Test driven development and the test-suite</a></H3>
<p>
A test driven development approach is central to the improvement and development of SWIG.
Most modifications to SWIG are accompanied by additional regression tests, and all
tests are run to ensure that no regressions have been introduced.
</p>
<p>
The regression testing is carried out by the SWIG <i>test-suite</i>.
The test-suite consists of numerous testcase interface files in the <tt>Examples/test-suite</tt> directory
as well as target language specific runtime tests in the <tt>Examples/test-suite/[lang]</tt> directory.
When the test-suite is run, it executes the following steps for each testcase:
</p>
<ol>
<li>Execute SWIG passing it the testcase interface file.</li>
<li>Compile the resulting generated C/C++ code with either the C or C++ compiler into object files.</li>
<li>Link the object files into a dynamic library (dll/shared object).</li>
<li>Compile any generated and any runtime test target language code with the target language compiler, if the target language supports compilation. This step thus does not apply to the interpreted languages.</li>
<li>Execute a runtime test if one exists.</li>
</ol>
<p>
For example, the <i>ret_by_value</i> testcase consists of two components.
The first component is the <tt>Examples/test-suite/ret_by_value.i</tt> interface file.
The name of the SWIG module <b>must</b> always be the name of the testcase, so the <tt>ret_by_value.i</tt> interface file thus begins with:
</p>
<div class="code">
<pre>
%module ret_by_value
</pre>
</div>
<p>
The testcase code will then follow the module declaration,
usually within a <tt>%inline %{ ... %}</tt> section for the majority of the tests.
</p>
<p>
The second component is the optional runtime tests.
Any runtime tests are named using the following convention: <tt>[testcase]_runme.[ext]</tt>,
where <tt>[testcase]</tt> is the testcase name and <tt>[ext]</tt> is the normal extension for the target language file.
In this case, the Java and Python target languages implement a runtime test, so their files are respectively,
<tt>Examples/test-suite/java/ret_by_value_runme.java</tt> and
<tt>Examples/test-suite/python/ret_by_value_runme.py</tt>.
</p>
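<p>
The naming convention for runtime tests can be written down as a one-line helper.
This is only an illustration: <tt>runme_path()</tt> is a hypothetical function,
not something provided by SWIG or its Makefiles.
</p>

```cpp
#include <string>

// Hypothetical helper: build the conventional location of a runtime test,
// Examples/test-suite/[lang]/[testcase]_runme.[ext].
std::string runme_path(const std::string &lang,
                       const std::string &testcase,
                       const std::string &ext) {
  return "Examples/test-suite/" + lang + "/" + testcase + "_runme." + ext;
}
```

<p>
For instance, <tt>runme_path("python", "ret_by_value", "py")</tt> yields
<tt>Examples/test-suite/python/ret_by_value_runme.py</tt>.
</p>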
<p>
The goal of the test-suite is to test as much as possible in a <b>silent</b> manner.
This way any SWIG or compiler errors or warnings are easily visible.
Should there be any warnings, changes must be made to either fix them (preferably) or suppress them.
Compilation or runtime errors result in a testcase failure and will be immediately visible.
It is therefore essential that the runtime tests are written in a manner that displays nothing to stdout/stderr on success
but exits with an error or exception and an error message on stderr on failure.
</p>
<H4><a name="Extending_running_test_suite">39.10.12.1 Running the test-suite</a></H4>
<p>
In order for the test-suite to work for a particular target language, the language must be correctly detected
and configured during the configure stage so that the correct Makefiles are generated.
Most development occurs on Linux, so usually it is a matter of installing the development packages for the target language
and simply configuring as outlined <a href="#Extending_starting_out">earlier</a>.
</p>
<p>
If you get a message that the test was skipped when running the test-suite commands that follow, it indicates that the
configure stage is missing information in order to compile and run everything for that language.
</p>
<p>
The test-suite can be run in a number of ways.
The first group of commands is for running multiple testcases in one run and should be executed in the top level directory.
To run the entire test-suite (can take a long time):
</p>
<div class="shell"><pre>
make -k check-test-suite
</pre></div>
<p>
To run the test-suite just for target language [lang], replace [lang] with one of csharp, java, perl5, python, ruby, tcl etc:
</p>
<div class="shell"><pre>
make check-[lang]-test-suite
</pre></div>
<p>
Note that if a runtime test is available, a message "(with run test)" is displayed when run. For example:
</p>
<div class="shell"><pre>
$ make check-python-test-suite
checking python test-suite
checking python testcase argcargvtest (with run test)
checking python testcase python_autodoc
checking python testcase python_append (with run test)
checking python testcase callback (with run test)
</pre></div>
<p>
The files generated on a previous run can be deleted using the clean targets, either the whole test-suite or for a particular language:
</p>
<div class="shell"><pre>
make clean-test-suite
make clean-[lang]-test-suite
</pre></div>
<p>
The test-suite can be run in a <i>partialcheck</i> mode where just SWIG is executed, that is, the compile,
link and running of the testcases is not performed.
Note that the partialcheck does not require the target language to be correctly configured and detected and unlike the other test-suite make targets, is never skipped. Once again, either all the languages can be executed or just a chosen language:
</p>
<div class="shell"><pre>
make partialcheck-test-suite
make partialcheck-[lang]-test-suite
</pre></div>
<p>
If your computer has more than one CPU, you are strongly advised to use parallel make to speed up execution.
This can be done with any of the make targets that execute more than one testcase.
For example, a dual core processor can efficiently use 2 parallel jobs:
</p>
<div class="shell"><pre>
make -j2 check-test-suite
make -j2 check-python-test-suite
make -j2 partialcheck-java-test-suite
</pre></div>
<p>