Titanium - Project Ideas

Research project ideas within the Titanium Project

This page provides a listing of project ideas related to the ongoing work with the Titanium project. Many are suitable for semester projects, and some have publication potential. The tasks range from relatively simple, isolated projects that an undergraduate with moderate programming skills could handle, to very complicated projects that could involve sweeping changes to the Titanium compiler and runtime system, which should probably only be attempted by veteran Titanium compiler developers.

Each task is labeled with a brief description and a rough approximation of the difficulty level. Unless otherwise noted, inquiries regarding tasks should be directed to the Titanium Developer mailing list. Also, if you choose to pursue one of these projects, please send mail to that list to keep us informed so we can prevent conflicts. This list will be updated as new projects are thought up and others completed or discarded. If you have an idea for a project related to languages, compilers, optimizers, HPC applications, benchmarking or parallel computation, there's probably a way we can fit it into the Titanium framework. Just ask!

Compiler/Optimizer Language Work

Titanium-to-Java translator (medium) - we'd like an automated way to translate Titanium programs into shared-memory, true-blue Java programs with threads. This would give Titanium users an upgrade path when moving to platforms not currently supported by Titanium, and provide a way to make some performance comparisons between comparable apps written in both languages. Probably best implemented by leveraging the existing Titanium parser and modifying it to dump legal Java source that replaces invocation of Titanium-specific features with calls to Java methods in a new compatibility layer that implements things like barrier, broadcast, etc. using multi-threaded Java code.

Investigate source-to-source compilation issues (medium) - Evaluate the effect of the Titanium's two-stage compilation implementation strategy (namely Titanium-to-C and C-to-target code) on sequential optimization. Are there examples of "standard" optimizations done by C compilers that are done poorly on our compiler-generated C code? Examples to look at are register management, constant folding, elimination of extra temporaries, basic block optimizations. What changes should be made to Titanium-to-C sequential optimization to better exploit optimization of C code? This one has definite publication potential.

Sequential optimizations (medium) - The Titanium optimizer is lacking some important sequential optimizations that should really be added. Namely: global copy propagation, and others. A number of analyses are already present in the optimizer that should facilitate developing these optimizations.

Add optimizations to leverage shared/private inference (medium) - Ben Liblit recently added an analysis and inference system to Titanium that statically infers which data objects are always private (i.e. never accessed by a remote processor). It should now be possible to augment the optimizer to leverage this new information to perform more aggressive sequential optimizations (e.g. instruction reordering and common sub-expression elimination) on private data.
Reconcile the shared/nonshared type qualifiers with shared and private regions (hard) - There's a bit of design work here, and also some good honest compiler hacking. The ultimate goal would be to eliminate the runtime checks that the region code performs whenever a pointer field is updated. We currently need those checks at runtime to ensure that objects in shared regions never contain pointers to private objects. If both systems could be brought together, that invariant could be enforced by the type checker, and therefore most or all the runtime checks could be eliminated. Contact Ben or David for more details.
Non-blocking remote memory read/writes (hard) - The Titanium distributed runtime system is designed to permit weak (non-blocking) remote memory operations. However, these operations are currently only utilized in a few hand-optimized areas of the runtime system. It would be nice to have an analysis that automatically infers which user-written remote memory operations would still be correct when implemented with non-blocking reads and writes, and implement an optimization that translates the appropriate blocking reads/writes to split-phase operations that are less latency-sensitive. The theoretical basis for this analysis has already been established (Arvind Krishnamurthy's thesis), and there's even some code written by Arvind, but we don't have a working implementation in the language yet.
Improve alias analysis (hard) - Theoretical work and a simple implementation of pointer alias analysis has been done by Glen Jeh, but it would be nice to implement a more aggressive analysis
Automatic loop tiling and matrix optimizations (hard) - We'd like to add some optimizations that detect certain matrix computations (either through pattern-matching or with explicit annotations) and restructure the computation for optimal performance on the given architecture. This may include some feedback directed optimization, similar to the FFT-W, Atlas, and Phipack projects. Geoff Pike has done some work on generic loop tiling optimizations, and this work could be extended for specific kernels.
Fix immutables to be mutable - implement value classes (very hard) - The titanium group has drafted a proposal for augmenting the immutable language construct to have mutability properties to support user's needs, without negating the performance advantage of inlined (stack-allocated) objects. Dan Bonachea and Paul Hilfinger have the details.
Extend loop optimizations to Java arrays (very very hard - probably a lost cause) - Titanium includes array indexing optimizations for Titanium arrays. It would be nice to extend these optimizations to also work with Java arrays, but the non-rectangular shape and aribitrary internal aliasing of Java multi-dimensional arrays makes this nearly impossible. It may still be worthwhile on single-dimensional Java arrays.

Language Runtime and Communication System Work

Implement an application event logger and/or tracing GUI (easy) - We'd like a facility to log application events, including major events in the runtime system (such as barriers, broadcasts, and other collective operations) as well as application-defined events (such as the start/end of various phases of computation) to assist in performance debugging. Ideally, this would become part of Titanium's standard library and provide some sort of graphical output to assist programmers. GASNet contains extensive tracing facilities for gathering this information, and even a prototype text-based tool for displaying the information that could help with getting this started.
An alternate approach would be to look at transforming the trace information as required to hook-up a pre-existing parallel performance profiling tool, such as Vampir or Tau.

Implement Region safety checking (medium) - Our language reference currently specs out behavior for ti.lang.Region to provide safe region-based memory management services, by detecting errors (namely deleting a Region while live references remain) and throwing an exception to prevent the unsafe memory deallocation (which enables recovery and debugging). Currently the necessary checks are not fully implemented, leaving a memory safety hole in the implementation. This project involves implementing region reference counts (following established theoretical algorithms) in the region library of the Titanium runtime system, using existing information the compiler already outputs concerning region access behavior.
Implement a debugger for Titanium (medium) - At one point Titanium had a debugger based on gdb, but the code has degraded considerably due to advances in gdb and the Titanium implementation and language. The translator still outputs debugging line and symbol information, but we need a debugger which can read this information and provide source-level debugging for Titanium applications. Ideally, this debugger would support parallel debugging (threads or perhaps even distributed-memory), but even a sequential debugger would be a significant improvement.
Investigate vectorization issues in Titanium (medium) - Titanium runs on vector supercomputer hardware such as the Cray X1, but insufficient work has been done to ensure good automatic vectorization of the generated C code. This project would involve benchmarking the performance of existing Titanium applications and test programs on vector machines, assessing the current impediments to vectorization, and find ways to improve vectorization of the generated code - starting with hand-modification of the generated code and runtime, and eventually finding ways to automate these improvements in the compiler.
Tune garbage collector (medium) - Titanium uses the Boehm-Weiser garbage collector. Recent versions of the collector include a number of advanced features (eg incremental collection, explicit type descriptors, parallel mark and sweep, object finalization, etc) that we're not currently taking advantage of.
Add GASNet support for new high-performance networks (medium) - Titanium has GASNet-based backends for several networks. We'd like to add GASNet support for new high-performance networks/machines, such as IBM Blue Gene, NEC SX, Sun Wildfire and/or other high-performance system-area networks. Contact Dan for more details.
Fix static (medium) - The current semantics of "static" in Titanium correspond to "per-processor" static. The Titanium group has decided to change the semantics of static to mean "per-job".
Titanium arrays over general domains (hard) - Extend Titanium arrays to allow them to have a general (non-rectangular) domain. This would mostly involve library development work.
Experiment with out-of-core support on Titanium arrays (hard) - some research has been done at Rice University about incorporating language support for out-of-core processing into multi-dimensional arrays in KeLP, a C++-based parallel library with some similar goals to Titanium. It may be interesting to modify the Titanium array library to support automatic out-of-core processing, whereby array data (or portions of array data) is automatically moved to and from disk (with possibly some caching) to support the development of concise, out-of-core algorithms in Titanium.

Java/Titanium Library Work

STL-like library using Ti templates (easy) - Titanium has extensive support for templates. It would be nice to use this feature to add some Titanium libraries analogous to the C++ standard template library that provide templatized support for various common data structures and algorithms (e.g. sort arbitrary type, hash table indexed by a provided (possibly non-reference) type, bitsets, type-parameterized sets and multisets, type-parameterized stack, type-parameterized queue, distributed array library, distributed task queue, etc.) Some other libraries would also be useful, for example a getopts-style command-line options parser, a Complex number immutable library, etc.

Implement missing native methods in Java 1.4 library support (medium) - Thanks to recent upgrades, the Titanium language is now a near-superset of Java 1.4, and contains most of the associated Java-level libraries. However, many of the native methods for the less commonly-used Java 1.4 standard library classes remain unimplemented. This involves selecting useful Java 1.4 classes that remain unimplemented and writing the supporting native code methods.

Deep exchange and possibly other deep remote copy operations (medium) - The Titanium exchange operation currently exchanges a "shallow" copy of the supplied array elements. In some cases, a deep copy is desirable

Add more collective communication libraries (medium) - The GASNet communication system provides a number of collective communication operations that are currently not exposed at the language level in Titanium. We should design appropriate interfaces for these operations and add them to the language/library.

Implement a parallel version of the Berger-Rigoutsos algorithm (medium) - The Berger-Rigoutsos algorithm is an edge-detection algorithm that attempts to cover points in a space with rectangular regions, which can then be used to subdivide a problem and structure further parallel computation. This algorithm is useful in various grid-based applications, and currently the only known implementation is purely sequential. It would be nice to have a parallel implementation, written in Titanium or C, available as a library for our applications users. Contact Peter McCorquodale for details. Reference: Berger & Rigoutsos, ``An Algorithm for Point Clustering and Grid Generation'', IEEE Trans. Syst., Man, Cybern., vol. 21, pp. 1278--1286 (1991).
Add a Titanium interface to ChomboVis (medium) - ChomboVis is a program for visualization of 2D and 3D AMR data sets. It would be nice to build a bridge from Titanium to ComboVis so application developers can easily visualize the data sets created by their Titanium programs. This may involve adding some minimal Titanium support for outputting application data to the file system in the HDF5 file format. Talk to Phil Colella for more information.

Misc Tasks

Update the Titanium sample code repository (easy) - The Titanium webpage could use some updates and tweaking to reflect ongoing progress - we especially need a page of Titanium applications and benchmarks (like the private app page, but more organized and tested to make sure it all compiles). There's also a number of applications missing from the private page.

Run some Java test suites on Titanium (easy) - Titanium implements a dialect of Java 1.4, and there are a number of Java conformance test suites available (eg the Jacks test suite from Jikes). We should run some of these suites, filter through the results to find failures in functionality that Titanium claims to support, and report those bugs for fixing.
Ti-doc implementation (medium) - we'd like a version of javadoc that works on titanium source code. We have the source for some old versions of javadoc that could be modified to understand Titanium and dump the relevant info
Implement client-server remote compilation for Titanium (medium) - Titanium supports a wide range of supercomputing systems and architectures. Generally the most difficult issue when porting to a new system is porting the translator component, which translates Titanium into platform-independent ISO C. It's also difficult and labor-intensive to maintain up-to-date installations of the translator across the large number of systems we frequently use. We'd like to add a client-server netcompilation option to Titanium, whereby compilation on remote systems can automatically connect to a centralized translation service (probably using HTTP and/or ssh) and submit their source files for translation, recieving the generated code result for backend compilation. This would allow us to run on several supercomputing systems we currently cannot support because it is impractical or impossible to run the translator on the target machine. A similar netcompile option has been implemented in Berkeley UPC, which we could use as a model.

Back to main page
Last updated Tuesday April 29, 2014
Comments to Titanium developers

Research project ideas within the Titanium Project

Categories:

Compiler/Optimizer Language Work

Language Runtime and Communication System Work

Java/Titanium Library Work

Misc Tasks