A Gentle Introduction to the Titanium Compiler (tcbuild)

updated 8/2005 by Dan Bonachea

This page describes the basic steps in getting started using the Titanium compiler. The full language reference is available as a separate Postscript document and in PDF form. You should also see the Titanium software documentation page for lots of useful documentation about the language and compiler implementation.

The user interface to the Titanium compiler is called "tcbuild". It works by translating Titanium code into C code (using the "tc" compiler binary), then automatically compiling the generated C files with the backend C compiler and linking them with the runtime libraries to create a native executable. The compiler works on several different types of machines: uniprocessors, shared-memory multiprocessors (SMPs), and distributed-memory multiprocessors (which include loosely-coupled clusters of uniprocessors or clusters of SMPs, as well as more tightly-coupled distributed-memory architectures such as the IBM SP or the Cray T3E).

The current compiler is available for a wide range of platforms -- see the Titanium software documentation page for further details.

Here is the comprehensive list of backends available, as of version v3.80:

Portable backends - available on most systems, at some potential cost in communication performance:

Hardware-specific backends - availability varies with hardware, but generally provide the best communication performance:

Invoking the Compiler

The compiler is invoked using the command "tcbuild". On the Berkeley EECS department systems you may find "tcbuild" in one of the following directories:
	/project/cs/titanium/srs/i686-pc-linux-gnu/bin/tcbuild         (stable release)
	/project/cs/titanium/srs/i686-pc-linux-gnu/bin/tcbuild-nightly (nightly snapshot)
	/project/cs/titanium/srs/sparc-sun-solaris2.8/bin/tcbuild      (stable release)

The Titanium developers also maintain installations of the Titanium compiler on a number of off-campus systems, see the Titanium software documentation page for complete info.

You may wish to add the appropriate directory to your search path. Run "tcbuild" with no options for a brief usage synopsis, or with --help for a more complete list of options:

	usage: tcbuild [options] files ...
	       tcbuild --help      to list options
	       tcbuild --settings  to show current configuration
	       tcbuild --version   to show Titanium version

Using Titanium for Running Sequential Jobs

Here is the canonical "Hello, world" program as written in Titanium:

class HelloWorld {
    public static single void main (String [] args) {
        System.out.println("Hello from processor " + Ti.thisProc() + "/" + Ti.numProcs());
    }
}

Note that any non-trivial parallel program will require some sharing of data between the processes, and will therefore make use of some of the Titanium extensions to Java such as "broadcast" or "exchange". Both of these require that you are executing within a "single" method, so your "main" will almost certainly be "single", as shown above. See the language reference or the (incomplete) tutorial for more on single and these other features.
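As an illustrative sketch (the class name and values below are made up, not taken from the language reference), a single main can use broadcast to hand one process's value to all the others:

	class BcastExample {
	    public static single void main (String [] args) {
	        int myVal = Ti.thisProc() * 10;          // each process computes a local value
	        // all processes must reach this statement together, hence the "single" main
	        int rootVal = broadcast myVal from 0;    // everyone receives process 0's value
	        System.out.println("proc " + Ti.thisProc() + " got " + rootVal);
	    }
	}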

In the simplest case, you can go from Java or Titanium sources to an executable with no special options:

	tcbuild HelloWorld.ti

This command builds an executable named "HelloWorld" to run on the sequential backend (which always has a single thread of execution). Within the "HelloWorld.ti" file, exactly one class should contain a method called "main" with the usual Java interface for a "main" method: it must be static, take an array of Strings, and return void. The default file extension for Titanium code is ".ti"; the extension ".java" is also accepted for compatibility with Java.

To run this program on a single processor, simply run the executable "HelloWorld". Standard input and output work as in Java. Any single-threaded Java program should compile and run as a Titanium program, so a reasonable strategy for getting started with Titanium is to write some Java code (or borrow some from elsewhere) and compile it as a Titanium program.

Using Titanium on a Shared-Memory Parallel Machine (SMP)

To compile a Titanium program for a shared memory machine, add the flag "--backend smp" to the tcbuild command line. 

To run the resulting code, you need to indicate how many processors the program should use by setting the TI_THREADS environment variable, e.g.:

	setenv TI_THREADS 4

Once this is set, simply give the name of the executable to run the program (this runs a 4-thread job as one process containing 4 Titanium processes implemented as pthreads). For maximum performance, the number of threads should not exceed the number of idle physical processors; otherwise threads will compete for CPU resources and performance can suffer. However, for testing and debugging purposes it is sometimes helpful to run with more threads than there are physical CPUs - for example, to simulate a 16-way parallel job on your dual-CPU or single-CPU workstation. This usage is fully supported by Titanium, but as you might expect, a computationally-expensive application run in such a mode may be significantly slower (because the runtime system will be simulating 16 processors using only 2 real CPUs).
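Putting the smp steps together, a complete build-and-run session (csh syntax, assuming "tcbuild" is on your search path) might look like this:

	tcbuild --backend smp HelloWorld.ti
	setenv TI_THREADS 4
	./HelloWorld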

The Titanium runtime uses fast spin-polling for inter-thread synchronization by default to maximize performance, but these algorithms can lead to pathological thrashing behavior if Titanium processes need to share CPU resources with each other or with other processes on the system. To deal with this problem, Titanium includes a "polite sync" feature which enables lower-performance synchronization algorithms that are more CPU-friendly and generally provide better performance in situations where CPUs are overcommitted - you can enable it with:

	setenv TI_POLITE_SYNC 1

The runtime will automatically detect when TI_THREADS exceeds the physical CPU count of the machine and enable this feature (printing a message to the console); however, it does not attempt to measure the machine load to account for other system processes. If you have reason to believe that a significant portion of the machine's physical CPUs may be occupied by other processes, you should consider setting this variable yourself.

Note: The instructions above run your smp Titanium job directly on the machine you are logged into. Many production supercomputing facilities require you to run computationally expensive jobs on the compute nodes of the machine (possibly through a node-reservation batch system), and may prohibit executing such jobs directly on the frontend/login nodes that you typically log into for compilation. Such policies ensure fairness to other users, and also help guarantee your application has dedicated use of the node hardware (otherwise contention from other processes can lead to performance degradation, as explained above). Consult the site-specific documentation to determine where and how to run your compute jobs. The same instructions above sometimes apply for running a Titanium smp job in a batch script or interactive batch session, although on other systems batch scripts also run on frontend nodes, so you may need to invoke a site-specific spawner and tell it to run the Titanium smp job on a single compute node.

Using Titanium on a Distributed-Memory machine (NOW, Millennium, and others)

Titanium has a number of backends which allow applications to be run across a distributed-memory system, through the use of high-performance, low-latency networking protocols and hardware. One feature of the Titanium language is that any program that you've compiled and tested on an SMP should also run correctly on a distributed-memory backend. However, it's important to note that unless you've given some attention to data distribution, the performance may be quite poor due to messaging overhead. For example, on an SMP it is reasonable to have one processor allocate a large shared structure and use "broadcast" to give other processes a reference to that object. However on a distributed-memory backend, you would most likely want each process to allocate one portion of the data structure and share these pieces using the "exchange" operation on Titanium arrays.
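As a hedged sketch of that idiom (the array sizes and the blockSize variable are illustrative, not from the reference), each process can allocate its own block and then share references with exchange, executed by all processes inside a single method:

	double [1d] myBlock = new double[0:blockSize-1];                        // local piece
	double [1d] single [1d] allBlocks = new double[0:Ti.numProcs()-1][1d];  // one slot per process
	allBlocks.exchange(myBlock);   // afterward, allBlocks[p] references process p's block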

The distributed-memory backends available are listed in the first section - all of Titanium's backends aside from sequential and smp are distributed-memory backends. To compile a Titanium program for these backends,  use the appropriate --backend flag to tcbuild (e.g. "--backend mpi-cluster-uniprocess").  The portable mpi-* backends require MPI 1.1 support (which thanks to MPI's popularity  is installed on most clusters designed for high-performance computing). The portable udp-* backends should work on any system that has basic TCP/IP support (however note that on some clusters this might not use the highest performing network hardware available). Often the highest-performance distributed-memory backend to use for your system will be a hardware-specific GASNet backend, which delivers the best performance by bypassing portable network protocols and directly targeting the underlying communication hardware.

The *-cluster-uniprocess and gasnet-*-uni backends run a single Titanium thread on each node of a distributed-memory system. The *-cluster-smp and gasnet-*-smp backends allow running multiple Titanium threads on each node of a distributed-memory system with multiple CPUs per node, and provide significantly faster communication between threads sharing a node. Note that the mechanisms used for spawning jobs on distributed-memory machines are often very site-specific, so in some cases you may need to consult documentation provided by your site administrator or local Titanium maintainer. If someone has installed Titanium for you, it's likely that the "tcrun" wrapper script can be used to spawn your jobs - see "tcrun --help" for details, but here's how you might run an 8-node *-cluster-uniprocess or gasnet-*-uni backend job using tcrun:

tcrun -n 8 HelloWorld

The distributed-memory Titanium runtime system will output information at startup describing the job layout, for example:

Tic: c19 is OS process 1 of 8 (Ti proc 1 of 8)
Tic: c18 is OS process 0 of 8 (Ti proc 0 of 8)
Tic: c22 is OS process 4 of 8 (Ti proc 4 of 8)
Tic: c20 is OS process 2 of 8 (Ti proc 2 of 8)
Tic: c23 is OS process 5 of 8 (Ti proc 5 of 8)
Tic: c21 is OS process 3 of 8 (Ti proc 3 of 8)
Tic: c25 is OS process 7 of 8 (Ti proc 7 of 8)
Tic: c24 is OS process 6 of 8 (Ti proc 6 of 8)
Hello, world
Hello, world
Hello, world

Note there is one line per distributed-memory process, which includes the hostname of the node being used and the corresponding Titanium process number(s). Check these headers to ensure you got the desired layout of Titanium processes across physical nodes. If the job spawn fails or gives you an undesirable layout, try adding the -v or -t options to the tcrun line; these display more detailed information about the actions taken by tcrun (which you can use as a template to copy and edit appropriately).

Consult the MPI backend usage docs and UDP backend usage docs for further information and troubleshooting about how to run programs using the mpi-* and udp-* backends. Consult the GASNet webpage for details on running programs on the gasnet-* backends and information about setting system-specific tuning parameters for communication behavior.

All *-cluster-smp and gasnet-*-smp backends require you to set the environment variable TI_THREADS to indicate the thread layout across the nodes of your system before running your application. For example:

setenv TI_THREADS "4 4 2 2"
tcrun -n 4 HelloWorld

would configure the environment for a 12-thread job, run across 4 nodes of the system (two quad-processor nodes and two dual-processor nodes), and produce output something like this:

Tic: c18 is OS process 0 of 4 (Ti procs 0..3 of 12)
Tic: c19 is OS process 1 of 4 (Ti procs 4..7 of 12)
Tic: c20 is OS process 2 of 4 (Ti procs 8..9 of 12)
Tic: c21 is OS process 3 of 4 (Ti procs 10..11 of 12)
Hello, world
Hello, world
Hello, world

Note again there is one line per distributed-memory process, although in this case each compute node process hosts several Titanium processes (implemented as pthreads), as selected by TI_THREADS.

Other Options with the Titanium Compiler 

The "tcbuild --help" option lists the available configuration switches, with a brief description of each. If you wish to change any aspect of the Titanium build process, start by looking at "tcbuild --help". Several more advanced tweaks are documented in "tcbuild --help-tcflags".

Here is a snapshot of the tcbuild --help output, as of v3.80:

usage: tcbuild [options] files ...
--D <key>[=<value>] / --U<key> Define or undefine user-provided preprocessor macro during Titanium and C preprocessing.
--E Preprocess only, and dump result to stdout. Implies --silent
--backend <back> Use multiprocessing backend <back>. Available backends for this host are: ... (differs by host)
--bcheck Use array bounds checking. To turn off bounds checking, use "--nobcheck".
--cache-dir <dir> Cache object files in directory <dir>.  To disable caching, simply omit the <dir> parameter.
--cc <prog> Use C compiler <prog>.
--cc-debug-flags <flags> Use <flags> on the C compiler command line for debug builds.
--cc-flags <flags> Add user-provided <flags> to the C compiler command line.
--cc-opt-flags <flags> Use <flags> on the C compiler command line for optimized builds.
--cc-system-flags <flags> System-provided <flags> for the C compiler command line.
--classlib-post <paths> Add colon-delimited <paths> to the end of the source file search path.
--classlib-pre <paths> Add colon-delimited <paths> to the start of the source file search path.
--debug Compile debugging information into object files.  May be abbreviated as "-g".  To omit debugging information, use "--nodebug".
--fail-keep Retain intermediate files after a C compilation failure, regardless of --keep settings.
--gdb Run tc under gdb (implies verbose)
--generate-dir <dir> Store generated C source files in <dir>.
--help Describe all available command-line options.
--help-envvars Describe environment variables affecting Titanium application execution.
--help-tcflags Describe --tcflags command-line options.
--keep-generate-dir Retain intermediate gendir after compilation. To delete intermediate gendir, use "--nokeep-generate-dir".
--keep-intermediate Synonym for "--keep-generate-dir --keep-object-dir". May also be abbreviated "--keep", or negated "--nokeep"
--keep-object-dir Retain intermediate objdir after compilation. Ignored when gendir == objdir.
--ld <prog> Use C linker <prog>.
--ld-flags <flags> Add <flags> to the C linker command line.
--ld-libs <libs> Append <libs> to the C linker command line.
--library Build a library.  To build an executable, use "--nolibrary".
--main <classname> Use <classname>.main as the main method.
--make <prog> Compile using make engine <prog>.
--make-flags <flags> Add <flags> to the make engine link line.
--object-dir <dir> Store object files compiled from generated C in <dir>. Defaults to generate-dir.
--optimize Optimize both Titanium as well as generated C code.  May be abbreviated as "-O".  To skip optimization, use "--nooptimize".
--outfile <name> Store finished Titanium program as <name>.
--profile Build an executable that produces gprof output. May be abbreviated as "-pg".
--rawmsg Suppress demangling of compilation progress messages
--rebuild-tlib Rebuild Titanium class library.  To use the prebuilt library, use "--norebuild-tlib".
--script <name> Run tc and output a shell script (<name>) that will perform C compilation and linking (implies --keep-generate-dir).
--sequential-consistency Use a sequentially consistent memory model. To use the standard, weaker memory model, use "--nosequential-consistency".
--settings Show the current default option settings and exit immediately.
--silent Work silently.  Only error messages will be printed.
--skip-tc Skip tc compilation step if possible (dangerous - for advanced users only!)
--stats Display statistics about the compilation.
--stoptifu Optimize with Stoptifu.
--stoptifu-stats Display statistics about Stoptifu transformations.
--tc <prog> Use Titanium compiler <prog>.
--tc-flags <flags> Add <flags> to the Titanium compiler command line. See --help-tcflags for more info.
--verbose Show the individual commands used for each phase of compilation.
--version Show the Titanium version number and exit immediately.

Options may be abbreviated to the shortest unambiguous prefix. The environment variable $TCBUILD_FLAGS is processed first, and may be used to set preferred defaults. You can add --settings to any tcbuild command line to see the state of all the options that will be used. For example, if you're a developer interested in seeing or playing with Titanium's generated C code, the following may be a useful setting to add to your .cshrc:

	setenv TCBUILD_FLAGS "--keep-intermediate --generate-dir ./tc-gen"

The most interesting and useful options above for general compilations are probably those related to debugging and optimization. For a debugging build of your application, specify a command line like this:

tcbuild -g HelloWorld.ti

When configuring for maximum performance, you most likely want a command line like this: 

tcbuild -O --nobcheck HelloWorld.ti

(--nobcheck turns off bounds checking which makes many programs run faster, but may be dangerous if your program contains bugs such as reading past the end of an array. This option shouldn't be used until the program is well tested and debugged)

One subtle option is "--make-flags". This is a list of flags to pass down to the make job that compiles and links the generated C/C++ code. It is provided primarily to speed up the backend C compilation step using parallelism. For example, on an SMP, try this to speed things up:

	tcbuild --make-flags -j8 HelloWorld.ti

If you find yourself using the same set of options again and again, put them in the ${TCBUILD_FLAGS} environment variable, and they will be picked up automatically each time you run "tcbuild". For tcsh users, there is a command you can add to your ~/.tcshrc file that provides intelligent command-line tab-completion for tcbuild command lines (saving you from typing long backend names).

Warnings, Caveats, and Bugs

To report bugs on the Titanium system (language, compiler, debugger, or benchmarks), use the Titanium bug database (on-campus access only) or email the Titanium developer list.

The Titanium compiler generates a large number of small temporary files during compilation - on some systems with strict quota limits this may cause problems for compiling large applications or using the --rebuild-tlib option. One way to deal with this is perform your compilations in a temp directory or on a scratch file system with less restrictive quotas (this is often a good idea for compilation performance reasons as well).
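One way to apply this, sketched here with an illustrative scratch path, is to redirect the intermediate files using the --generate-dir option described above:

	tcbuild --generate-dir /scratch/mytmp/tc-gen HelloWorld.ti

This keeps the many small generated C files off a quota-limited home file system.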

For large programs, you are permitted to pass multiple source files to tcbuild, although in general it is sufficient to pass the name of the source file containing the main method; Titanium will automatically locate the other relevant source files from the current directory and classpath, according to the same rules used by javac. In general you should avoid using source files whose names differ only in their suffix (".ti" or ".java") - these can cause "ambiguous reference" errors because the compiler won't know which files to use for resolving imported classes. Also, you should avoid source files whose names differ only in case, as this is not portable to all file systems.
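For example (these file names are hypothetical), either of the following command lines would build a program whose main method lives in Main.ti:

	tcbuild Main.ti Solver.ti Mesh.ti    (pass every source file explicitly)
	tcbuild Main.ti                      (let the compiler locate Solver.ti and Mesh.ti)

The second form relies on the javac-style source search described above.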

Getting "tcbuild" to work involves a large number of components, including a Perl script, an executable, a makefile, and several shell scripts. With so many moving parts, it is easy for something to break - luckily most of the bugs have been worked out by now, but if "tcbuild" fails for you, it would be helpful to the compiler developers to see such items as:

The contents of this page were a combined effort of Ben Liblit, Kathy Yelick and Dan Bonachea. Send all complaints to the Titanium developer list.
Last updated Tuesday April 29, 2014