PEAR\PHP_UML

Tutorial

Baptiste Autin, December 13th, 2009, updated September 7th, 2011

INTRODUCTION

PHP_UML is a PHP parser, an XMI generator and an API documentation tool.
In practice, PHP_UML lets you feed a UML CASE tool, such as Rational Rose or ArgoUML, with a UML representation of existing PHP source code. This way, you get an instant overview of a PHP application, with all the usual functions of a software design tool (such as class diagram export, refactoring of object-oriented applications, or automatic code generation).

PHP_UML generates a logical view (the packages and the classes found), a deployment view (that maps the filesystem that has been scanned), and a component view.

See SOFTWARES_TO_USE_WITH_PHP_UML for an overview of the existing UML tools.

THE COMMAND LINE TOOL "phpuml"

If you have installed PHP_UML with the PEAR install process, you should be able to use PHP_UML directly from the command line.

Type phpuml -h to get a list of all available commands.

To specify the PHP files/directories to scan, pass them as main arguments.
Eg. phpuml G:\Inetpub recursively parses the directory "G:\Inetpub"
Eg. phpuml index1.php index2.php parses the files "index1.php" and "index2.php"

By default, phpuml will generate the XMI code in version UML 2, and will redirect it to the screen.
To save it to a file, specify the output folder with the option -o
Eg. phpuml G:\Inetpub -o . scans "G:\Inetpub", and saves the XMI code to a file "default.xmi" in the current directory (.)
You can also specify a file name, instead of a directory path: "-o foo.xmi"

Note that you can also use the redirection operator:
Eg. phpuml G:\Inetpub > test.xmi scans "G:\Inetpub", and saves the XMI code to a file "test.xmi"

To get the XMI code in version 1.x, use the option -x
Eg. phpuml G:\Inetpub -x 1 -o G:\tmp scans "G:\Inetpub", and saves the XMI code in version 1 to a file "G:\tmp\default.xmi"

With the option -n, you can name your model (= give a name to the UML root package).
Since PHP_UML saves to a file named after the model name, the following command will save the XMI to a file called "foo.xmi":
Eg. phpuml G:\Inetpub -o . -n foo

In addition to "xmi", three other output formats are available: "html", "htmlnew", and "php"

Use the option -f to specify which format phpuml should generate.
Eg. phpuml G:\Inetpub -f html -o G:\Inetpub\api recursively scans the directory "G:\Inetpub", and creates the HTML documentation in "G:\Inetpub\api"

If you need to provide your own XMI file, instead of parsing some existing files, simply pass it as an argument.
Eg. phpuml myFile.xmi -f php -o G:\Inetpub\Foo reads the XMI code contained in "myFile.xmi", and generates the PHP code templates in "G:\Inetpub\Foo"

Note that, if you read an XMI file in UML/XMI version 1.x, the content will be automatically converted to version 2.1.
This provides an interesting way to convert XMI files from version 1 to 2:
Eg. phpuml foo1.xmi -o foo2.xmi reads "foo1.xmi", and, if its XMI content is in version 1, converts it to version 2, and stores it in "foo2.xmi"

The option -errorLevel is used to specify the error reporting level: 0 for a silent process, 1 for the exceptions (default), 2 for the exceptions and all the warnings (in particular, PHP_UML will raise a warning every time it cannot resolve a type).

Instead of using the command-line tool "phpuml", you can also use PHP_UML programmatically, through its public API. The following sections explain how to do that.

OVERVIEW OF THE API

Output formats

In addition to XMI, three output formats are available: "Html" and "HtmlNew" for HTML API documentation, and "Php" for PHP code generation.
An experimental "Eclipse" format also exists.

Use the helper method export($format, $location) if the default settings suit your needs:
$uml->setInput('/var/www/');
$uml->export('Html', '/var/api/');

...or use the various Exporter implementations: PHP_UML_Output_Html_Exporter, PHP_UML_Output_HtmlNew_Exporter, PHP_UML_Output_Php_Exporter, PHP_UML_Output_Eclipse_Exporter.

Errors

PHP_UML_Warning::$stack;
$stack is an array of (potential) warnings raised during parsing

See also "examples/test_with_api.php" for an example of XMI generation without using the PHP parser.

OPTIONS

$uml->setMatchPatterns($patterns);
- $patterns: a string containing one or several file patterns (with the wildcards ? and *).
For example, if you need to parse only the files with the extension .php or .php5, insert this command before starting the parser:
$uml->setMatchPatterns('*.php, *.php5');
By default, PHP_UML parses ONLY *.php files!

$uml->setIgnorePatterns($patterns);
- $patterns: a string containing one or several file patterns (with the wildcards ? and *)
For example, if you need to ignore all directories starting with a dot (like the Subversion folders, for instance):
$uml->setIgnorePatterns('.*');
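
To make the interplay of the two pattern options more concrete, here is a minimal sketch of how a comma-separated pattern string could be applied to a file or folder name. It is not PHP_UML's own code: the functions matchesAnyPattern() and shouldParse() are hypothetical helpers, and the sketch assumes shell-style matching as provided by PHP's fnmatch().

```php
<?php
// Hypothetical helpers, NOT part of PHP_UML. They only illustrate how a
// pattern string such as '*.php, *.php5' can be tested against a name.

// Returns true if $name matches at least one of the comma-separated patterns.
function matchesAnyPattern(string $name, string $patterns): bool
{
    foreach (explode(',', $patterns) as $pattern) {
        // trim() removes the space that follows each comma
        if (fnmatch(trim($pattern), $name)) {
            return true;
        }
    }
    return false;
}

// Assumption: a name is kept when it matches the "match" patterns
// and does not match any of the "ignore" patterns.
function shouldParse(string $name, string $match, string $ignore): bool
{
    return matchesAnyPattern($name, $match) && !matchesAnyPattern($name, $ignore);
}
```

With these helpers, shouldParse('index.php5', '*.php', '.*') is false, which mirrors the remark above that only *.php files are parsed unless you widen the match patterns.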

$uml->logicalView = true;
- If true (the default), PHP_UML will include a logical view in the XMI code.
A logical view is what you will probably look for first. It consists of all the UML classes, interfaces, packages, methods, attributes... that the parser has found in your PHP code.

$uml->deploymentView = true;
- If true, PHP_UML will include a "Deployment view" in the XMI code, in addition to the logical view.
In a deployment view, each file is represented by a UML Artifact, each physical folder by a package, and the whole is stored in a package called "Deployment view".
In UML2-aware tools, a "manifestation" should automatically be created between a class and its corresponding source artifact, so that you know in which file a class/interface was defined.

$uml->componentView = true;
- If true, PHP_UML will include a "Component view" in the XMI code.
In UML/XMI version 1, a component view is a set of UML Subsystems (one for each folder)
In UML/XMI version 2, a component view is a set of UML Components:
Each PHP file is represented by a component, linked to the logical classes/interfaces that it contains.
Each physical folder is also a component, and all the elements are nested in each other, like they are in the filesystem.
Note that the structure of this "component view" could change a little bit in the future.
If you have any ideas or opinions about what a component view should be, from a reverse-engineering perspective, your feedback is welcome.

$uml->dollar = true;
- If true, the symbol $ is kept along with the variable names

$uml->docblocks = true;
- If true, docblocks are parsed (@package, @param, @return...)
Note that disabling this option might considerably change the structure of the resulting model, because the information contained in a docblock tag like @package has a great influence on the structure of the namespaces/packages.
If you disable that option, all the type hints contained in the docblock @param will not be retrieved either.

$uml->onlyAPI = false;
- If true, only the elements (classes, functions, properties) whose docblocks explicitly contain an "@api" tag are included in the final API. All the other elements are ignored.
By default, that option is not set.
Note that, if you need to exclude one particular element from the API, you can also add the tag "@internal" to its docblock.

$uml->showInternal = false;
- By default, PHP_UML discards all the elements (classes, functions, properties) with a docblock "@internal".
If you need to have them included, set this option to true.
Note that, in case of a conflict between "showInternal" and "onlyAPI" on a particular element, PHP_UML will skip the element.
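
The combined effect of "onlyAPI" and "showInternal" can be sketched as a small decision function. This is only one possible reading of the rules above, not PHP_UML's actual implementation: the function isIncluded() and its parameters are hypothetical, and the sketch assumes that a "conflict" means one option would keep the element while the other would drop it.

```php
<?php
// Hypothetical sketch, NOT PHP_UML's code.
// $hasApi / $hasInternal: whether the element's docblock carries @api / @internal.
// $onlyAPI / $showInternal: the values of the two options described above.
function isIncluded(bool $hasApi, bool $hasInternal, bool $onlyAPI, bool $showInternal): bool
{
    // Elements tagged @internal are discarded unless showInternal is set.
    if ($hasInternal && !$showInternal) {
        return false;
    }
    // With onlyAPI, elements without an @api tag are discarded.
    if ($onlyAPI && !$hasApi) {
        return false;
    }
    // No rule asked for the element to be dropped.
    return true;
}
```

Because an element is dropped as soon as either rule rejects it, a conflict between the two options always ends with the element being skipped, as stated above.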

$uml->pureObject = false;
- Although this is not very UML-like, PHP_UML can parse procedural code. The export format "htmlnew" is currently the only format to benefit from this new capability (the XMI format cannot, since it is a strictly object-oriented XML vocabulary).
Set this option to true if you want the parser to discard all procedural elements.

PACKAGES AND NAMESPACES

First, a little bit about packages...

In UML, a package is just a container. It contains typed elements, or other packages. Typed elements are datatypes, classes, interfaces...
UML, by itself, does not say how you should name and organize your packages (this is your business) but it is likely that you will want to define them according to logical rules.
For example, in PHP_UML, the classes that build the XMI code are gathered in a package named "XMI", while the PHP parser is put into a different package.
This logical organization is independent of the filesystem layout, which is a separate concern.

Since version 5.3, PHP has supported namespaces, which are intended to avoid conflicts between class names. The namespace is a very general notion.

In UML, a package is always a namespace for its members.

PHP_UML AND NAMESPACES

Since packages are so important in UML and OO development, PHP_UML tries to use them as much as possible.

First, PHP_UML can read docblock comments.
So you can specify the package of a class by inserting a "@package Foo" in the class comment.
PHP_UML can also read file docblocks, which means that all the classes defined in a file whose main comment contains a "@package Foo" will be considered as belonging to the package "Foo".
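
As a rough illustration of the docblock rule above, the following sketch extracts a package name from a docblock string. It is not PHP_UML's parser: packageFromDocblock() and packageFor() are hypothetical helpers, and the precedence of the class docblock over the file docblock is an assumption made for this example only.

```php
<?php
// Hypothetical helpers, NOT part of PHP_UML.

// Returns the name that follows "@package" in a docblock, or null if absent.
function packageFromDocblock(?string $docblock): ?string
{
    if ($docblock !== null && preg_match('/@package\s+(\S+)/', $docblock, $m) === 1) {
        return $m[1];
    }
    return null;
}

// Assumption: the class docblock is consulted first, then the file docblock.
function packageFor(?string $classDocblock, ?string $fileDocblock): ?string
{
    return packageFromDocblock($classDocblock) ?? packageFromDocblock($fileDocblock);
}
```

A null result stands for a class that carries no package information in any docblock.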

Then, PHP_UML interprets the namespacing instructions: namespace and use.
Once the parser has found something like, say, namespace PEAR\PHP\PHP_UML;, it considers every further class as belonging to a package called "PHP_UML", belonging to a package called "PHP", belonging itself to a package called "PEAR".
In short, the hierarchy of the UML packages matches the hierarchy of the PHP namespaces.
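
The mapping from a namespace to nested packages can be sketched in a few lines. This is only an illustration of the idea, not the way PHP_UML builds its model: namespaceToPackagePath() and nestPackages() are hypothetical helpers, and packages are represented here as plain nested arrays.

```php
<?php
// Hypothetical helpers, NOT part of PHP_UML.

// Splits a namespace name into the list of package names, outermost first.
// An empty namespace (the global namespace) gives an empty list.
function namespaceToPackagePath(string $namespace): array
{
    $trimmed = trim($namespace, '\\');
    return $trimmed === '' ? [] : explode('\\', $trimmed);
}

// Turns ['PEAR', 'PHP'] into ['PEAR' => ['PHP' => []]]:
// each package is a key whose value holds its sub-packages.
function nestPackages(array $path): array
{
    $tree = [];
    foreach (array_reverse($path) as $name) {
        $tree = [$name => $tree];
    }
    return $tree;
}
```

For the namespace PEAR\PHP\PHP_UML, the path is PEAR, PHP, PHP_UML, and the nested result places "PHP_UML" inside "PHP", itself inside "PEAR".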

And what about classes that are not docblock-commented and that are not preceded by a "namespace" instruction?
They are simply put into the "default" (or root) package of your UML model. PHP_UML considers that this root package is also the "global namespace" of the PHP language.

Last thing to know about packages...

PHP_UML does not parse the "require" or "include" instructions in the files it scans, and this may lead to unwanted results, especially if you use the docblock @package to define the package of your classes.
Take this class:

/**
 * @package A
 */
class Foo
{
    function foo(Foobar $x) {
    }
}

And here's Foobar:

/**
 * @package B
 */
class Foobar {
}

Since Foobar is defined in package B (mind the docblock!), PHP_UML cannot correctly resolve the type Foobar declared in the signature of the function foo().
... unless foo() is modified like this:

 function foo(\B\Foobar $x) {
 }
 
... but that kind of writing is possible only in namespaced PHP.
We therefore recommend that you always use the PHP namespace instructions, rather than the @package docblock tags.
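
For comparison, here is how the two classes above could look once they rely on namespaces instead of @package tags. This is a sketch of the recommended style, not output of PHP_UML; putting both namespaces in a single file and the Reflection check at the end are choices made for this example only (it assumes PHP 7.1 or later).

```php
<?php
namespace B;

class Foobar
{
}

namespace A;

class Foo
{
    // The fully qualified name \B\Foobar states the package explicitly,
    // so no docblock is needed to resolve the type.
    public function foo(\B\Foobar $x)
    {
    }
}

// Reflection shows the fully qualified type that a parser can rely on.
$param = (new \ReflectionMethod('A\Foo', 'foo'))->getParameters()[0];
echo $param->getType()->getName(), "\n"; // prints "B\Foobar"
```

In a real project, each namespace would normally live in its own file.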

XMI COMPATIBILITY

Your XMI code might be interpreted differently by the modeling tool you are going to use along with PHP_UML.
This is particularly true for version 2 of XMI.
For instance, the Eclipse plug-ins (EMF, Papyrus) only accept a particular flavour of XMI, called "ecore", which is only partly compatible with the one you will get with PHP_UML.

PHP_UML does not aim at implementing the whole UML specification, but instead it is designed to reverse-engineer PHP to UML, in a quick and simple way.

Read the file SOFTWARES_TO_USE_WITH_PHP_UML for more information about the compatible applications, and to learn how to import your XMI code into Eclipse.