Lecture Notes for CS 509, Advanced Programming II

Lecture Notes for Advanced Programming II

9 May 2002 - Proposed Additions for Standard C++ '0x

Outline

C++ Standardization
Type Traits
- Type Discrimination
- Type Traits
- Type Trait Example
- The Type-Trait Proposal
Regular Expressions
Hash Tables
- What are Hash Tables?
- STL Hash Tables
Threads
Smart Pointers

C++ Standardization

The ISO JTC1-SC22/WG21 - a group of national standards bodies.
ANSI NCITS/J16 is the American member.
C++ standardization involves three subcommittees: language, library, and performance.
The C++ standard
- Began in the early 1990s.
- Drafts issued in '95 and 6.
- Final draft approved in '97.
- ISO standard issued in '98.
- A three-year period of contemplation followed.
ISO standards must be revisited every five years

Type Traits

All types are not created equal.
- Using STL containers with references.
- Template parameters supporting all necessary operations.
- Different types support different efficiencies.
  - POD can be memcpy()ed; classes must be T::T(const T&)ed.
  - Sorting with random-access vs. bidirectional iterators.

Type Discrimination

Making choices based on type characteristics.
- If P is T &, use template<P> void f(P p) { },
  otherwise use template<P> void f(P & p) { }.
- Does P support clone()?
- If P is a POD type use memcpy(),
  otherwise use P::P(const P&).
- If P is a random-access iterator, use heapsort(),
  otherwise use selectionsort().
Choices can be made at compile-time or at run-time.
Facilities to do this have been implemented over and over.
- Differing facilities of varying complexity.
- A good criterion for standardization.
- SGI, Loki, and Boost offer traits libraries.

Type Traits

A type trait is some characteristic of a type.
- Size in bytes, supports clone(), is a reference.
- May be programmer defined.
Type traits can be made available at run-time.
- Code can use type-trait values to make choices.
Type traits are based on template specialization.
- That mean they break your compiler, probably.

Example - the POD trait.

template <typename T> struct
is_POD { const bool POD_type = false; }

template &typename T> struct
is_POD<bool> { const bool POD_type = true; }

template &typename T> struct
is_POD<char> { const bool POD_type = true; }

// And so on.

is_POD<int> POD_type;

Type Trait Example

Is T void?

template <typename T> 
struct is_void
  static const bool void_type = false

template <> 
struct is_void<void>
  static const bool void_type = true

template <typename T>
static void check_for_void()
  is_void<T> void_check
  if (void_check.void_type)
    std::cerr << "Invalid void parameter type.\n"

template <typename T>
struct needs_nonvoid
  needs_nonvoid()
    check_for_void<T>()

int main()
  needs_nonvoid<int> ok
  needs_nonvoid<void> bad

See the complete code.

The Type-Trait Proposal

The proposal defines three new libraries for unary, relational, and transforming traits.
Unary type traits answer questions about types.
- Defined in <type_traits>.
- Unary type traits include is_pointer, is_POD, and has_trivial_constructor.
Relational type traits compare two types.
- Defined in <type_compare>.
- Relational type traits include is_same and is_convertible.
Transforming type traits change one type into another.
- Defined in <type_transform>.
- Transforming type traits include remove_const, remove_reference, and add_pointer.

Regular Expressions

String processing in C and C++ is weak.
- To some, this is a source of strength: there's nothing in my way.
Recognizing parts of strings is a frequent and ill-supported operation in C++.
- Is this a valid integer? Where's the word in this string? Does this string contain any characters it shouldn't?
Regular expressions are a simple and powerful way of analyzing strings.
- "^[-+]?[0-9]+"$ is a valid integer.
- "[a-z]+" is the word in a string.
- "[^[]fF+-]*$" matches bad characters.

Regular Expression Proposal

Offered up by the Boost group.

Hash Tables

Existing associative containers are implemented with trees.
- O(n log n) insertion and deletion behavior.
- Maintains components in sorted order.
Trees are good for large data sets, but have too much overhead for small data sets.
- The definitions of "large" and "small" vary.
- Sorting also imposes an overhead.
Hash tables have the opposite characteristics.
- They handle small data sets efficiently (in constant time on average).
- In theory, can gracefully scale to large data sets too.
SGI, Rogue Wave, and Metrowerks offer hash table implementations.

What are Hash Tables?

A hash is a function from things to numbers.
- For example, std::accumulate(str.begin, str.end, 0).
- A number is a sequence of around 1 to 16 bytes.
- The more random a hash function, the better.
A hash table is an associative container that stores things keyed by their hash value.
- hash_table[hash_function(thing)] = thing
Hash tables are an old and deeply studied data structure.
- There are many hard and open questions about hash tables.
- Hash tables have many parameters, most of which are critical.
- The parameters are easy to get wrong and hard to get right.
- Good parameter values are dependent on the data set.
- Dyanamic hash tables can solve some of these problems, but with overhead costs and complexity.

STL Hash Tables

The proposal would add hashed associative continers to the STL.
- Map, set, multimap, multiset.
It also defines hash as the generic hash function.
The usual associative table constructors.
- Single element insert and delete have constant average-case running time and itlit(O(n)) worst-case time.
  - Tree insert and delete has O(log n) average- and worst-case time.
Hash table iterators are forward iterators (or perhaps bidirectional).
STL hash tables are dynamic to deal with variable-sized data sets.
- rehash(n) - set the minimum hash table size to n.
- set_max_load_factor(n) - make free space proportional to n.

Threads

Smart Pointers

This page last modified on 9 May 2002.