MDD meets ABAP

MDD has been a hot topic for some time now. With the arrival of powerful tools like EMF and oAW, it has become possible to actually reap the benefits of model-driven development without spending too much time and money on the tools. MDD has long since reached its break-even point between intellectual investment and development speed-up for certain types of projects. It's not the solution to all problems, but it's a really powerful technique with a set of stable and proven tools available.

SAP R/3 systems are well established - mostly recognized as ERP or CRM systems, but actually capable of doing much more. Unknown to many outside the R/3 world, every R/3 system comes with a full-fledged development environment. There's an almost traditional rift between "old-school" R/3 developers and "modern" non-R/3 developers, and this might contribute heavily to the fact that up to now, there seems to be no real contact between the world of MDD and R/3 - at least not in the sense of "toolkits for pushing developer productivity". SAP itself relies heavily on modeling and generation techniques, but little of this effort is available as a tool for the average developer. Why is it so difficult to get the "best of both worlds" - is it only a matter of missing interest and information, or are there more serious problems to be solved? Come with me and see if and how model-driven development practices can be applied to SAP R/3 development and what issues have to be solved on the way.

This article is based on the contents of my talk at the Special Interest Group Model Driven Software Engineering (SIG-MDSE) in May 2008.

In an average MDD project in a Java environment, we use generative technologies to produce certain artifacts automatically. These will usually be source code files for classes and interfaces, control and description files (for example, XML files to drive Ant builds or control serialization) or some language-dependent properties files, all nicely arranged in folders representing the package structure. In other words, the results of the generation steps are mostly text files, and the name, extension and content of a file have to be taken into account when determining what to do with it. The role of a file named MANIFEST.MF, for example, can be determined by its name alone, but you have to take a look at the contents of FooBar.java to find out whether there's a class FooBar or an interface FooBar inside. For code generation, this is an advantage - you can use the very same tool set to generate all of the files involved.

Things are different in an SAP R/3 system. The Matrix already taught us that "there is no spoon", and similarly, in an R/3 system, there are no files. Classes and interfaces are defined and described in database tables, attributes and methods are stored in tables referring to these tables, and method parameters - you get the idea. Even the source code of the methods is stored in database tables. Take a look at the following set of screenshots to get an impression of how to define and implement a class in the R/3 development environment.

ABAP class definition and method implementation

As an additional minor complication, there are parts of ABAP that do not support object orientation - for screen-based interaction, for example, you still need procedural programs that have a different internal structure. Even worse: there are lots of artifacts that bear no resemblance to code at all, but nonetheless need to be generated very frequently: data structures (both for transient use and persistent storage) and the elements the structures consist of, dialog screens, language-dependent texts, lock and logging objects and so on. Many of these artifact types have a complex inner structure and refer to other objects, and they are essential for development. It's not possible to omit these objects because they are just as much a part of the application as a class or an interface.

You might have noticed that we just ran into another complication - low-level dependency management. If you didn't notice it, there's no need to be embarrassed because this is a non-issue in "traditional" MDD projects. Take the following UML diagram as an example:

UML example (Java)

In a Java environment, we have to generate three text files and place them in the appropriate directory, then the Java compiler will take care of the rest. We can generate the files in any order we like because compilation will only start after the generation is completed. You might have guessed: in an R/3 system, that's different. First, we have to distinguish between creation and activation of an object. I don't want to take you too deep into the internals of the R/3 IDE - for now, it's enough to know that objects can be created and saved in an "inactive" state but cannot be used until they are active. Inactive objects may be incomplete and may even contain syntax errors - which have to be removed before activation, of course. Once an active object is modified, the active version is kept in place and an additional inactive version is created, and when that version is activated, it replaces the previous active version.

When creating artifacts that refer to other objects, there are two important rules:

  • Objects used in the definition must exist in an active state when the definition is saved. (Example: data types of parameters in a method definition.)
  • Objects used in an implementation must exist in an active state when the referring object is activated. (Example: data types used in a method implementation.)
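The two rules can be modeled in a few lines. The following is only a toy sketch in Python, not an R/3 interface: the names `RepoObject`, `Repository`, `save` and `activate` are invented for illustration, and the real repository checks are of course far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class RepoObject:
    """Hypothetical stand-in for a repository object (class, interface, ...)."""
    name: str
    definition_refs: set = field(default_factory=set)      # e.g. parameter types
    implementation_refs: set = field(default_factory=set)  # e.g. types used in method bodies
    active: bool = False


class Repository:
    """Toy model of the two rules above; purely illustrative."""

    def __init__(self):
        self.objects = {}

    def _is_active(self, name):
        obj = self.objects.get(name)
        return obj is not None and obj.active

    def _unmet(self, obj, refs):
        # References to the object itself are ignored in this simplified model.
        return sorted(r for r in refs if r != obj.name and not self._is_active(r))

    def save(self, obj):
        # Rule 1: objects used in the definition must be active when saving.
        missing = self._unmet(obj, obj.definition_refs)
        if missing:
            raise ValueError(f"cannot save {obj.name}: not active: {missing}")
        self.objects[obj.name] = obj

    def activate(self, name):
        # Rule 2: objects used in the implementation must be active when activating.
        obj = self.objects[name]
        missing = self._unmet(obj, obj.implementation_refs)
        if missing:
            raise ValueError(f"cannot activate {name}: not active: {missing}")
        obj.active = True
```

In this model, saving an object whose definition refers to a type that has not been activated yet raises an error, while a reference that only appears in the implementation is tolerated until activation.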

Let's apply these rules now, taking the following UML diagram as an example:

UML example (ABAP)


As you can see, CL_MY_SECOND_CLASS implements IF_MY_INTERFACE - this relation is part of the definition of the class, so IF_MY_INTERFACE has to exist before we can even save the class. (Actually, for a human developer, that's not true - you can of course add interface implementations to existing classes. For a program generator, that's not possible - we'll talk about that later on.) IF_MY_INTERFACE defines a method with a parameter that is typed as a reference to CL_MY_FIRST_CLASS, so again, CL_MY_FIRST_CLASS has to exist before we can save the interface. This leads to the following mandatory order of generation:

  1. CL_MY_FIRST_CLASS
  2. IF_MY_INTERFACE
  3. CL_MY_SECOND_CLASS

If we don't stick to this order, generation will fail - and we haven't generated a single line of code so far...

For small target applications, it's theoretically possible to define the order of generation manually, but for larger applications with a varying number and structure of objects, that's no longer an option. On the other hand, it's not advisable to "pollute" the generator or the templates with instructions about the order in which the objects have to be generated. A feasible solution is to perform a two-step generation: first generate a transient target model that contains the objects that need to be generated in a structure that matches the R/3 system as closely as desired. The contents of this intermediate model can be generated in any order, and from the contents, many intrinsic low-level dependencies can already be deduced. Other dependencies can be added during the model-to-model transformation if required. In a second step, the contents of the intermediate model can be arranged as a dependency graph (which should then be a set of DAGs - otherwise you're in trouble...) and processed depth-first, so that every object is generated only after the objects it depends on.
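The second step boils down to a depth-first topological sort. Here is a minimal sketch in Python, assuming the dependencies have already been extracted from the intermediate model into a plain dictionary; the function name `generation_order` and the dictionary layout are made up for this example and are not part of any tool mentioned here.

```python
def generation_order(dependencies):
    """Return object names so that every object follows its prerequisites.

    `dependencies` maps each object name to the set of names it depends on.
    Raises ValueError if the graph contains a cycle (i.e. is not a DAG).
    """
    order = []
    state = {}  # name -> "visiting" while on the current path, "done" afterwards

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            cycle = " -> ".join(path + [name])
            raise ValueError(f"cyclic dependency: {cycle}")
        state[name] = "visiting"
        # Sorting only makes the result deterministic.
        for dep in sorted(dependencies.get(name, ())):
            visit(dep, path + [name])
        state[name] = "done"
        order.append(name)  # appended only after all prerequisites

    for name in sorted(dependencies):
        visit(name, [])
    return order


deps = {
    "CL_MY_SECOND_CLASS": {"IF_MY_INTERFACE"},
    "IF_MY_INTERFACE": {"CL_MY_FIRST_CLASS"},
    "CL_MY_FIRST_CLASS": set(),
}
print(generation_order(deps))
# ['CL_MY_FIRST_CLASS', 'IF_MY_INTERFACE', 'CL_MY_SECOND_CLASS']
```

The recursion is fine for a sketch; a model with very deep dependency chains would probably call for an iterative variant.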

This approach also neatly eliminates the artifact type issue I described above - it's possible to combine all these structures and object types in the meta-model of the intermediate model and include them in the process of generation. It's also a good idea to dump this intermediate model to a file - it's much easier and faster to debug the generation process if you can take a direct look at its results. Now we can also use our beloved standard MDD tools to perform the task, although we need a slight detour for the source code generation: first we generate an intermediate model (using Xtend, for example) that contains everything but the source code, then we generate the source code into separate text files (using Xpand, for example), and only then do we merge the two by stuffing the method code into multi-line string properties of the intermediate model. This is an example of what such an intermediate model might look like:

Intermediate Repository Object Model
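The merge step mentioned above, which stuffs the generated method code into the intermediate model, might be sketched like this in Python. Everything here is an assumption made for illustration: the `ClassNode` and `MethodNode` types, the `merge_sources` function and the file layout `<source_dir>/<CLASS>/<METHOD>.abap` are invented and do not reflect the actual meta-model.

```python
from dataclasses import dataclass, field
from pathlib import Path
import tempfile


@dataclass
class MethodNode:
    name: str
    source: str = ""  # multi-line string property, filled in by the merge step


@dataclass
class ClassNode:
    name: str
    methods: list = field(default_factory=list)


def merge_sources(classes, source_dir):
    """Copy generated method bodies from text files into the model.

    Assumes the hypothetical layout <source_dir>/<CLASS>/<METHOD>.abap.
    Returns the (class, method) pairs for which no file was found.
    """
    missing = []
    for cls in classes:
        for method in cls.methods:
            path = Path(source_dir) / cls.name / f"{method.name}.abap"
            if path.is_file():
                method.source = path.read_text(encoding="utf-8")
            else:
                missing.append((cls.name, method.name))
    return missing


# Usage: fake the output of the code generator in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    class_dir = Path(tmp) / "CL_MY_FIRST_CLASS"
    class_dir.mkdir()
    (class_dir / "DO_SOMETHING.abap").write_text("WRITE 'hello'.\n", encoding="utf-8")
    model = [ClassNode("CL_MY_FIRST_CLASS",
                       [MethodNode("DO_SOMETHING"), MethodNode("NOT_GENERATED")])]
    unmerged = merge_sources(model, tmp)
```

Returning the list of methods without a source file makes it easy to spot templates that silently produced nothing before the model is handed on.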

By the way, we just skipped an essential problem - where do we get the names of our artifacts from? Again, this is not an issue for Java projects, mainly because you can have as many classes named Foo or Bar as you like, as long as they are located in different packages. The R/3 system also contains something named "package", but in contrast to most other development environments, packages in ABAP do not provide separate namespaces. The central object repository of the development environment is a flat database of all objects, requiring that object names be unique within the entire system - no two classes named CL_FOO can exist at the same time. With a maximum of 30 characters for class names and only 16 characters for table names, inventing the artifact names becomes a real issue - but one that has to be solved for each project separately.
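To make the constraint tangible, here is one naive naming scheme sketched in Python: truncate the proposed name to the length limit and append a counter on collisions. The `NameRegistry` class is hypothetical, it treats all object types as one flat pool of names for simplicity, and a real project will most likely want something smarter than blind truncation.

```python
MAX_NAME_LENGTH = {"class": 30, "table": 16}  # limits quoted in the text above


class NameRegistry:
    """Hypothetical registry handing out system-wide unique artifact names."""

    def __init__(self, taken=()):
        # Names already present in the target system (assumed to be known).
        self.taken = {name.upper() for name in taken}

    def reserve(self, kind, proposal):
        limit = MAX_NAME_LENGTH[kind]
        base = proposal.upper()
        name = base[:limit]
        counter = 1
        while name in self.taken:
            # Make room for the suffix so the result never exceeds the limit.
            suffix = f"_{counter}"
            name = base[: limit - len(suffix)] + suffix
            counter += 1
        self.taken.add(name)
        return name


registry = NameRegistry(taken=["CL_FOO"])
first = registry.reserve("class", "cl_foo")
table_a = registry.reserve("table", "ZMY_VERY_LONG_TABLE_NAME")
table_b = registry.reserve("table", "ZMY_VERY_LONG_TABLE_NAME")
```

Here `first` cannot be `CL_FOO` because that name is already taken, and the two table names are forced to differ even though both stem from the same proposal.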

Let's return to our fancy intermediate model. Now we have a fairly exact picture of what we want to create and even know in which order we have to create it. What's missing is a consistent way to actually make our dreams come true. There are virtually no official interfaces that allow write access to the R/3 development environment - heck, even the interfaces for reading come in homeopathic doses only. There are quite a few very good reasons for this, but the bottom line is that we just have to accept this fact. There's no official support for generating most of the object types, and the few interfaces that exist are even broken in some places.

Now it is possible to roll your own interfaces. It's tedious, it's even dangerous (you can ruin the entire system if you don't know where you're poking around), but it's possible. I've done it, and there's a good reason I'm not putting these interfaces on-line (besides the IP issues) - if something goes wrong and a system goes down, there's my name on the program that broke it. But seriously - to actively maintain and support a set of interfaces would require extensive testing (read: manpower) and a landscape of multiple systems with multiple R/3 versions (read: financial power), and I've got neither. I'm currently evaluating a second way - transforming the intermediate model into a file that SAPlink can import. Due to the way SAPlink files are structured, it's quite a challenge, and I don't have anything presentable yet.

Please drop me a note if you're interested in this topic - I'd love to discuss some ideas and would greatly appreciate any comments.
