DOC++. Open Source - Open Science - Open Systems



Abstract

DOC++ is a documentation system for C, C++ and Java. It generates TeX output for high quality hardcopies and HTML output for online browsing of the documentation. The documentation is extracted directly from the C/C++ headers and source files or Java class files.
DOC++ is Open Source, which means that it is distributed in source form and, like other open source projects such as Linux, is developed by several dozen programmers all over the world who coordinate their work over the Internet. The Internet is therefore essential to such projects. From a global point of view, the Internet is nothing but a huge open system.
The life cycle of a software project does not end when a version is released. The project evolves: bugs that are found get fixed, new features are added on users' demand, and the cycle never ends. The word around which everything revolves is cooperation. The main idea of the Open Source model is that if everybody cooperates, everybody wins.
It is important to stress that the concepts of Open Source and Open System can be found not only in the computing world, but in almost every field of activity. The practice of developing projects under the open source model has given rise to a new concept, Open Science.

Introduction to DOC++

The idea of DOC++ is to provide a tool that supports the programmer in writing high quality documentation while keeping the focus on program development. To do so, it is important that the programmer can add the documentation right into the source code being developed. Only with such an approach will a programmer really write documentation for his classes, methods, etc. and keep it up to date with upcoming changes to the code. Hence, the best place to put documentation is in comments.

This is exactly what DOC++ uses for generating documentation. However, there are two types of comments the programmer wants to distinguish. One type is comments written to remember implementation issues, while the other is comments that document classes, functions, etc. so that he or someone else will be able to use the code later on. In DOC++ this distinction is made via different types of comments.
Now, let's consider what "high quality" documentation means. Many programmers like to view the documentation online simply by clicking their mouse buttons. A standard for such documents is HTML, for which good viewers are available on almost every machine. Hence, DOC++ has been designed to produce HTML output in a structured way.
But have you ever printed an HTML page? Doesn't it look ugly, compared to what one is used to? This is not a problem for DOC++ since it also provides TeX output for generating high quality hardcopies.
For both output formats, it is important that the documentation is well structured. DOC++ provides hierarchies that are reflected as sections, subsections, etc., or as HTML page hierarchies, respectively. An index is also generated that allows the user to easily find what he is looking for.
As C++ and Java are object-oriented languages, another type of hierarchy is introduced, namely class hierarchies. The best way to read such a hierarchy is by looking at a picture of it. Indeed, DOC++ automatically draws a picture for each class derivation hierarchy or shows it with a Java applet in the HTML output.

Some additional goodies of DOC++ are:

By now, DOC++ has been ported to and thoroughly tested on many different platforms, namely: Linux, Solaris, AIX, HP/UX, IRIX, OSF, FreeBSD, Windows 95, 98 and NT.

DOC++

Just like in JavaDoc, the documentation for DOC++ is contained in special versions of Java, C or C++ comments. These are comments with the format:

Note that DOC++ comments are only those with a double asterisk `/**' or a triple slash `///', respectively. We shall refer to such a comment as a DOC++ comment. Each DOC++ comment is used to specify the documentation for the subsequent declaration (of a variable, class, etc.).
Every DOC++ comment defines a manual entry. A manual entry consists of the documentation provided in the DOC++ comment and some information from the subsequent declaration, if available.
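As a small illustration of the distinction described above, the sketch below mixes ordinary comments with DOC++ comments in C++. The names `add' and `kMaxValue' are invented for this example, and the one-line form is written with a triple slash, as the DOC++ user's manual spells it; treat that spelling as an assumption if your version differs.

```cpp
// Ordinary comment: an implementation note that is not documentation.
// (add and kMaxValue are hypothetical names used only for illustration.)

/** Adds two integers.
    This block comment opens with a double asterisk, so it defines the
    manual entry for the declaration that follows it. */
int add(int a, int b)
{
    return a + b;   // ordinary comment, not part of the manual entry
}

/// Largest value accepted by the example.
const int kMaxValue = 100;
```

Only the two comments that start with `/**' and `///' are meant to end up in the generated manual; the remaining comments stay private to the source file.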
Manual entries are structured into various fields. Some of them are automatically filled in by DOC++ while the others may be specified by the documentation writer:

Field name       Provider   Description
@type            DOC++      depends on source code
@name            both       depends on source code
@args            DOC++      depends on source code
@memo            user       short documentation
@doc             user       long documentation
@return          user       doc of return value of a function
@param           user       doc of parameter of a function
@exception       user       doc for exception thrown by a function
@precondition    user       doc for preconditions
@postcondition   user       doc for postconditions
@invariant       user       doc for invariants
@see             user       cross reference
@author          user       author
@version         user       version

Table 1 - DOC++ manual entries


Except for explicit manual entries, the first three fields will generally be filled automatically by DOC++. How they are filled depends on the category of a manual entry, which is determined by the source code following a DOC++ comment. Generally they contain the entire signature of the subsequent declaration. The following table lists all categories of manual entries and how the fields are filled:

Category          @type          @name   @args
macro             #define        name    [argument list]
variable          Type           name    -
function/method   Return type    name    arguments list [exceptions]
union/enum        union/enum     name    -
class/struct      class/struct   name    [derived classes]
interface         interface      name    [extended interfaces]

Table 2 - How the entries' fields are filled
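
To make Table 2 more concrete, here is a hedged sketch with one declaration for four of the categories. All identifiers are invented, and the ordinary comment above each entry only states what the table suggests DOC++ would fill in; the exact text of each field may differ between DOC++ versions.

```cpp
// macro: @type "#define", @name "SQUARE", @args the argument list "(x)"
/// Squares its argument.
#define SQUARE(x) ((x) * (x))

// variable: @type "int", @name "retry_count", no @args
/// Number of attempts before giving up.
int retry_count = 3;

// function: @type "int" (the return type), @name "clamp_to_range",
// @args the parameter list
/// Limits a value to the closed interval [low, high].
int clamp_to_range(int value, int low, int high)
{
    if (value < low)  return low;
    if (value > high) return high;
    return value;
}

// enum: @type "enum", @name "Color", no @args
/// Basic display colors.
enum Color { Red, Green, Blue };
```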


In any case `@name' contains the name of the declaration to be documented. It will be included in the table of contents.
The remaining fields are filled from the text in the DOC++ comment. Except for the `@doc' and `@memo' fields, the text for a field must be preceded by the field name at the beginning of a line of the DOC++ comment. The subsequent text up to the next occurrence of a field name is used for the field. Field `@name' is an exception in that only the remaining text on the same line is used to fill the field. As an example:

    @author Snoopy

is used to fill the `@author' field with the text ``Snoopy''.
Text that is not preceded by a field name is used for the `@doc' field. The very first text in a DOC++ comment up to the first occurrence of the character `.' is also copied to the `@memo' field. This may be overridden by explicitly specifying a `@memo' field. In that case `.' characters are allowed as well.
The `@type', `@args' and `@doc' fields may not be filled explicitly.
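
The field rules above can be sketched on a single function. The function `safe_divide' and its parameter names are hypothetical; the field names are the ones listed in Table 1, and the comments inside the DOC++ comment describe the expected behaviour according to the rules just given, not verified output.

```cpp
#include <stdexcept>

/** Integer division with a zero check.
    Divides the numerator by the denominator and truncates toward zero.
    Text like this, with no field name in front of it, belongs to the
    long documentation; the first sentence above also serves as the
    short one.

    @param  numerator    value to be divided
    @param  denominator  value to divide by
    @return the truncated quotient
    @exception std::invalid_argument if the denominator is zero
    @author Snoopy
    @version 1.0
*/
int safe_divide(int numerator, int denominator)
{
    if (denominator == 0)
        throw std::invalid_argument("denominator must not be zero");
    return numerator / denominator;
}
```

Each user-supplied field starts on its own line with the field name, so the text of one field ends where the next field name begins.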
DOC++ automatically imposes a hierarchical structure on the manual entries for classes, structs, unions, enums and interfaces, in that it organizes the members of each as subentries.
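
A minimal sketch of such an automatic hierarchy, using an invented class `Point': the class comment defines the parent entry, and each documented member would be listed beneath it as a subentry.

```cpp
/** A two-dimensional point.
    The documented members below are expected to appear as subentries
    of this class entry. */
class Point
{
public:
    /// Creates a point from its coordinates.
    Point(double x, double y) : x_(x), y_(y) {}

    /// Horizontal coordinate.
    double x() const { return x_; }

    /// Vertical coordinate.
    double y() const { return y_; }

private:
    double x_;  // ordinary comments: storage details, not documented
    double y_;
};
```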
Additionally DOC++ provides means for manually creating subentries to a manual entry. This is done via documentation scopes. A documentation scope is defined using a pair of brackets:

	    //@{
	        ...
	    //@}

just like variable scopes in C, C++ or Java. Instead of ``//@{'' and ``//@}'' one can also use ``/*@{*/'' and ``/*@}*/''. All the manual entries within a documentation scope are organized as subentries of the manual entry preceding the opening bracket of the scope, but only if this is an explicit manual entry. Otherwise a dummy explicit manual entry is created.
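
The following sketch groups two free functions under one manually created entry. It assumes that an explicit manual entry is one whose name is given with `@name' in a DOC++ comment that is not attached to a declaration; the entry title and the function names are made up for illustration.

```cpp
/** @name Temperature conversions
    Helper functions for converting between temperature scales. */
//@{
/// Converts degrees Celsius to degrees Fahrenheit.
double celsius_to_fahrenheit(double c)
{
    return c * 9.0 / 5.0 + 32.0;
}

/// Converts degrees Fahrenheit to degrees Celsius.
double fahrenheit_to_celsius(double f)
{
    return (f - 32.0) * 5.0 / 9.0;
}
//@}
```

Both functions sit between the opening and closing brackets, so they would be organized as subentries of ``Temperature conversions'' rather than as top-level entries.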
In addition to this, Java allows the programmer to organize classes hierarchically by means of ``packages''. Packages are directly represented in the manual entry hierarchy generated by DOC++. When a DOC++ comment is found before a `package' statement, the documentation is added to the package's manual entry. This functionality as well as documentation scopes are extensions to the features of JavaDoc.
Similar to Java's packages, C++ comes with the ``namespace'' concept. The idea is to group declarations of classes, functions, etc. into different universes. DOC++ deals with namespaces in the same way it does with packages.
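
By analogy with the package case, a DOC++ comment placed before a namespace might look like the sketch below. The namespace `geometry' and its function are hypothetical, and the resulting layout of the manual is an expectation based on the description above, not a tested result.

```cpp
/** Geometry helpers.
    Entries documented inside this namespace are expected to be listed
    under the namespace's own manual entry. */
namespace geometry {

/// Area of a rectangle with the given side lengths.
double rectangle_area(double width, double height)
{
    return width * height;
}

} // namespace geometry
```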

There is one more special type of comment for DOC++, namely ``//@Include: <files>'' and ``/*@Include: <files>*/''. When such a comment is parsed, DOC++ will read the specified files in the order they are given. Wildcards using ``*'' are also allowed. It is good practice to use only one input file and include all documented files using such comments, especially when explicit manual entries are used for structuring the documentation. This text is a good example of such documentation.
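
A single input file of the kind recommended above could be as short as the following sketch. The entry title and the file names `point.h' and `shapes/*.h' are placeholders; only the `//@Include:' form, the `*' wildcard and the scope brackets are taken from the description in this paper.

```cpp
/** @name Example library reference
    Top-level entry that collects the documented headers. */
//@{
//@Include: point.h
//@Include: shapes/*.h
//@}
```

The file contains no code of its own; it only fixes the order in which the documented files are read and places them under one explicit manual entry.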

DOC++ provides both HTML and TeX output. Both languages have formatting macros which are more or less powerful. The idea of DOC++ is to be able to generate both output formats from a single source. Hence, it is not possible to rely on the full functionality of either set of formatting macros. Instead, DOC++ supports a subset of each set of macros that has proved to suffice for most applications. However, for a given run of DOC++ the user must decide which formatting macros to use. The subset of each macro package is listed in the following subsections. If one uses only one of the subsets, good-looking output can be expected for both formats.

A short history

The development of DOC++ was begun in 1995 by two Germans, Roland Wunderling and Malte Zöckler. After about two years, they stopped maintaining the project. In early 1998, I started using DOC++ to document another project of mine, Xterminal. In the middle of 1998, I made some changes to the source code so that the HTML output would better fit my needs. Naturally, I tried to share my changes with others by sending them to the original DOC++ developers. I received no answer, but I managed to get in contact with someone who had also made some changes, namely Michael Meeks. He did not have spare time to handle all these changes, so after a couple more attempts to contact the original authors, I took over the maintainer job in December. By now there are many people who help me with the development by hunting bugs, porting to other platforms, and so on.

Open Science

Ultimately, science is an open source development. Every discovery made by science must be justified. To be justified, a research result must be replicable. Replication is not possible unless the source is shared. It is true that a discovery can follow different paths or can occur in isolation. But its evolution continues through the sharing of information.
Replication makes scientific results viable. One scientist cannot expect to account for all possible test conditions. By sharing the results with a community of peers, the scientist enables many eyes to see what one pair of eyes might miss. The open sharing of scientific results facilitates discovery. It minimizes duplication of effort because others will know when they are working on similar projects. Progress does not stop because one scientist stops working on a project. If the results are worthy, other scientists will step in and take over the job.
Sharing the source code facilitates creativity. People working on complementary projects can each leverage the results of the other, or combine resources into a single project. One project may help bring about another project that would not have been conceived without it.

References

  1. Open Source. Voices from the Open Source Revolution, O'Reilly & Associates, 1999
  2. DOC++. A documentation System for C, C++, IDL and Java, User's manual
  3. Dragos Acostachioaie, Open Systems and Evolution of Computing Industry, Development and Application Systems, Suceava, 1998

Note: this text was published in the Proceedings of the 5th International Conference on Development and Application Systems, Suceava, 2000, pp. 223-226.