[DCS Homepage]
[Grad Homepage]
[Search]
[..Next]
[Prev..]
[Department]
[Faculty]
[Applications]
[Grad Program]
[Financial Assist.]
[Grad Courses]
[Theses]
[Administration]
M.Sc. Theses:
[1998-99]
[1999-2000]
[2000-01]
[2001-02]
Ph.D. Theses:
[1998-99]
[ 1999-2000 ]
[2000-01]
[2001-02]
Ph.D. Theses 1999-2000
This page lists theses' titles and abstracts of recent Ph.D. graduates
of the department, starting from July 1, 1999. Stay tuned for some
wonderful graduates!
|
Name: | Luk, Chi-Keung
|
|
Supervisor(s): | Mowry, Todd
|
|
Degree: | Ph.D.
|
|
Date: | January 20, 2000
|
|
Title: | Optimizing the Cache Performance of Non-Numeric Applications
|
Abstract:
The latency of accessing instructions and data from the memory
subsystem is an increasingly crucial performance bottleneck in modern
computer systems. While cache hierarchies are an important first step,
they alone cannot solve the problem. Further, though a variety of
latency-hiding techniques have been proposed, their success has been
largely limited to regular, numeric applications. Few promising
latency-hiding techniques that can handle irregular, non-numeric codes
have been proposed, in spite of the popularity of such codes in
computer applications.
This dissertation investigates hardware and software techniques for
coping with the instruction-access latency and data-access
latency in non-numeric applications. To deal with
instruction-access latency, we propose cooperative instruction
prefetching, a novel technique which significantly outperforms
state-of-the-art instruction prefetching schemes by being able to
prefetch more aggressively and much further ahead of time while at the
same time substantially reducing the amount of useless prefetches.
To cope with data-access latency, we investigate three complementary
techniques. First, we study how to use compiler-inserted data
prefetching to tolerate the latency of accessing pointer-based data
structures. To schedule prefetches early enough, we design three
prefetching schemes to overcome the pointer-chasing problem associated
with these data structures, and we automate them in an optimizing
research compiler. Second, we study how to safely perform an important
class of locality optimizations, namely dynamic data layout
optimizations, in non-numeric codes. Specifically, we propose the use
of an architectural mechanism called memory forwarding which can
guarantee the safety of data relocation, thereby enabling many
aggressive data layout optimizations (which also facilitate
prefetching) that cannot be safely performed using current hardware or
compiler technology. Finally, in an effort to minimize the overheads
of latency tolerance techniques, we propose new cache miss prediction
techniques based on correlation profiling. By correlating cache
miss behaviors with dynamic execution contexts, these techniques can
accurately isolate dynamic miss instances and so pay the latency
tolerance overhead only when there would have been cache misses.
Detailed design considerations and experimental evaluations are
provided for our proposed techniques, confirming them as viable
solutions for coping with memory latency in non-numeric applications.
Key words and phrases: Cache performance, non-numeric
applications, tolerating latency, instruction prefetching, data
prefetching, locality optimizations, cache miss prediction.
|
Name: | Ruppert, Eric
|
|
Supervisor(s): | Fich, Faith
|
|
Degree: | Ph.D.
|
|
Date: | December, 1999
|
|
Title: | The Consensus Power of Shared-Memory Distributed Systems
|
Abstract:
In many asynchronous distributed systems, processes communicate
by accessing objects in a shared memory. The ability of
systems to solve problems in a fault-tolerant manner depends
on the types of objects provided. Here, the wait-free model of
fault-tolerance is used: non-faulty processes must run
correctly even if other processes experience halting failures.
The consensus problem, where processes begin with private
inputs and must agree on one of them, has played a central
role in analysing the power of distributed systems. This
thesis studies the ability of different types of objects to
solve consensus.
An object type has consensus number n if it can be
used (with read/write registers) to solve consensus
among n processes but not among n+1 processes.
Conditions are given that are necessary and sufficient for an
object type to have consensus number n. This
characterization applies to two large classes of objects:
readable objects and read-modify-write (RMW) objects. An
object is readable if processes can read its state without
changing the state. For a RMW object, all operations update
the state and then return the previous state of the object.
When the type is of bounded size, the characterization may
be used to decide the question ``Does the type T
have consensus number n?'', which is undecidable for
arbitrary types. The characterization is also used to show
that different readable and RMW types with consensus number
n cannot be used in combination to solve consensus
for n+1 processes.
Ordinarily, processes may access only one object in shared
memory at a time. This thesis also studies how much the
consensus number of a type increases in the multi-object and
transactional models, where processes can perform operations
on up to m of the objects in a single atomic action.
These models are much more convenient for programmers to use,
since they guarantee that certain blocks of operations will
be executed without interruptions from other processes. This
thesis establishes bounds on the consensus numbers of
multi-objects and transactional objects as a function of
m and the consensus numbers of the corresponding
single-access types.
|
Name: | Dissett, Luis
|
|
Supervisor(s): | Mathon, Rudolf
|
|
Degree: | Ph.D.
|
|
Date: | November, 1999
|
|
Title: | Combinatorial and computational aspects of finite geometries
|
Abstract:
This thesis presents a methodology for the systematic generation of
strongly regular graphs by fusing classes in large association
schemes; this methodology includes:
-
an algorithm for finding the eigenvalue matrix of an association scheme;
-
an algorithm for determining the lattice of matrices determined by
subgroups of the automorphism group of an association scheme;
-
an integer programming formulation that allows us to determine quickly
whether a matrix (obtained by fusing the columns of the eigenvalue matrix
of an association scheme according to some group) will yield strongly
regular graphs with a given parameter set;
-
an exhaustive search algorithm that, given the parameters of a graph and
a matrix, will find all the combinations of columns of the matrix that
correspond to strongly regular graphs with the given parameters.
Using the methodology presented here, we have settled the existence
problem for strongly regular graphs with the following parameters:
625,156,29,42;
729,308,127,132;
729,336,153,156;
1024,231,38,56;
1024,297,76,90;
1024,330,98,110;
1024,363,122,132;
1024,396,148,156;
1024,429,176,182;
1024,462,206,210.
We have also found new graphs for many parameter sets for which some
graphs were already known, as well as graphs which were themselves
known.
Many of the graphs found are pseudo-geometric, i.e., their parameters
correspond to those of the point graph of some hypothetical finite
geometry. Some of the graphs are actually geometric, and it is
possible to recover the geometry from the maximum cliques of the
graph.
The graphs presented here are all of cyclotomic type; therefore they
all can be expressed as two-intersection sets in suitable projective
geometries, and they all determine two-weight codes. However, the
methodology is not restricted to this kind of graphs.
We have found six interesting non-isomorphic strongly regular graphs
with parameters (625, 156, 29, 42). If we consider the corresponding
two-intersection sets in PG(3, 5), the affine lines determined by each
of these sets form a geometric structure that resembles a semipartial
geometry: given any antiflag (p, L), the number of points in L
which are collinear with p is either 0, 1 or 2 (i.e., this number
takes one of three values, while in the case of partial geometries it
takes only one value and in the case of semipartial geometries it takes
one of two values).
Another interesting property of these graphs is that it is possible to
factorize the complete graph K_{625} into copies of some of these
graphs; we exhibit three such factorizations.
|
Name: | Lu, Chien-Ping Paul
|
|
Supervisor(s): | Zhou, Songnian
|
|
Degree: | Ph.D.
|
|
Date: | November 1999
|
|
Title: | Scoped Behaviour for Optimized Distributed Data Sharing
|
Abstract:
We introduce the novel scoped behaviour abstraction and examine how
it is used to optimize distributed data-sharing patterns within the
Aurora parallel programming system. Scoped behaviour is an application
programmer's interface to a set of system-provided optimizations; it
is also an implementation framework for the optimizations. Aurora is a
distributed shared data system where shared-data objects are implemented
in C++ as abstract data types. Aurora has been prototyped on a network
of workstations connected by an ATM network.
The design, implementation, and evaluation of Aurora and scoped
behaviour contributes to the field of parallel and distributed systems
by demonstrating that:
1. Scoped behaviour can provide per-object and
per-context (i.e., specific portion of the source code)
flexibility when applying data-sharing optimizations. In contrast
to some other systems, Aurora programs can be incrementally
tuned and only a minimum number of error-prone changes to the
source code are required in order to experiment with different
optimization strategies.
2. Scoped behaviour can be implemented without language extensions
and without special compiler support. Scoped behaviour's novel
implementation framework can exploit both compile-time and
run-time information about the parallel program.
3. A parallel programming system based on a high-level
shared-data abstraction can achieve high performance. In a
performance evaluation of four applications implemented
using three different types of parallel programming systems,
Aurora usually matches or outperforms, sometimes by a wide
margin, TreadMarks (a distributed shared memory system) and a
message-passing system (either MPICH or PVM, depending on the
application).
|
Name: | Edmonds, Philip
|
|
Supervisor(s): | Hirst, Graeme
|
|
Degree: | Ph.D.
|
|
Date: | October 1999
|
|
Title: | Semantic Representations of Near-Synonyms for Automatic Lexical Choice
|
Abstract:
We develop a new computational model for representing the fine-grained
meanings of near-synonyms and the differences between them. We also
develop a sophisticated lexical-choice process that can decide which
of several near-synonyms is most appropriate in any particular
context. This research has direct applications in machine translation
and text generation, and also in intelligent electronic dictionaries
and automated style-checking and document editing.
We first identify the problems of representing near-synonyms in a
computational lexicon and show that no previous model adequately
accounts for near-synonymy. We then propose a preliminary theory to
account for near-synonymy in which the meaning of a word arises out of
a context-dependent combination of a context-independent core meaning
and a set of explicit differences to its near-synonyms. That is,
near-synonyms cluster together.
After considering a statistical model and its weaknesses, we develop a
clustered model of lexical knowledge, based on the conventional
ontological model. The model cuts off the ontology at a coarse grain,
thus avoiding an awkward proliferation of language-dependent concepts
in the ontology, and groups near-synonyms into subconceptual clusters
that are linked to the ontology. A cluster acts as a formal
usage note that differentiates near-synonyms in terms of
fine-grained aspects of denotation, implication, expressed attitude,
and style. The model is general enough to account for other types of
variation, for instance, in collocational behaviour.
We formalize various criteria for lexical choice as preferences
to express certain concepts with varying indirectness, to express
attitudes, and to establish certain styles. The lexical-choice
process chooses the near-synonym that best satisfies the most
preferences. The process uses an approximate-matching algorithm that
determines how well the set of lexical distinctions of each
near-synonym in a cluster matches a set of input preferences.
We implemented the lexical-choice process in a prototype
sentence-planning system. We evaluate the system to show that it can
make the appropriate word choices when given a set of preferences.
|
Name: | Mac Donald, Craig
|
|
Supervisor(s): | Enright, Wayne
|
|
Degree: | Ph.D.
|
|
Date: | October 28, 1999
|
|
Title: | A New Approach for DAEs
|
Abstract:
This thesis introduces a new 'least squares' implementation of a continuous
extension of an underlying discrete Runge-Kutta (RK) formula that is
particularly effective for solving DAEs. The resulting method provides a
continuous approximation to the solution of the DAE and does not assume that
the problem has a special form or that the index of the problem is known.
The methods we develop and implement include some that are based on underlying
discrete formulas that were previously not considered suitable for DAEs
because of severe order reduction or stability restrictions. Specifically we
propose a particular implementation of a modified extension of these
formulas and develop stability and convergence results which indicate that
these methods can be effective when solving DAEs. Numerical results are
included to verify this.
We also investigate and justify an error control and stepsize choosing
strategy that is based on directly monitoring and bounding the continuous
residual (or defect) of the numerical solution. We investigate the
relationship between the size of the residual and the local error for index
one and index two DAEs. We also derive the extra order conditions that arise
for special classes of index one and index two problems.
Our fixed stepsize and variable stepsize numerical results confirm the
improved order and stability that is achievable with our approach. These
results also seem to indicate that our approach is less likely to run into
convergence difficulties (with nonlinear equations that must be solved on each
timestep) than is observed with traditional implementations.
If you have any thoughts, suggestions or queries, please feel free
to contact us at www@cs.toronto.edu
Last updated March 2, 2000