Incremental Evaluation of OCL Constraints
[Full thesis document]
Intro
A conceptual schema (CS) is the representation of the general knowledge of a domain. In conceptual modeling, we call Information Base (IB) the representation of the state of the CS (the set of existing objects and links) in the system. Integrity constraints play a fundamental role in the definition of CSs. An integrity constraint defines a condition that must always be satisfied by the data in the IB. The software system is in charge of ensuring that the state of the IB is consistent with respect to the integrity constraints.
This process is known as integrity checking. Unfortunately, current MDA, MDD and code-generation methods and tools do not provide adequate integrity checking mechanisms, since most of them only admit some predefined types of constraints. Moreover, the few that support full expressivity in the constraint definition language verify the IB inefficiently (see the survey).
In this thesis, we propose a new method to deal with the incremental evaluation of the integrity constraints defined in a CS. We consider CSs specified in the UML with constraints defined as OCL invariants. We say that our method is incremental since it adapts some of the ideas of the well-known methods developed for incremental integrity checking in deductive and relational databases. The main goal of incremental methods is to consider as few entities (i.e. objects) of the IB as possible during the evaluation of an integrity constraint. This is achieved in general by reasoning from the structural events (i.e. inserts/updates/deletes of objects and inserts/deletes of links) that modify the contents of the IB. Our method is fully automatic and ensures an optimal evaluation of the integrity constraints regardless of their concrete syntactic definition.
The main characteristic of our method is that it works at a conceptual level. That is, the result of our method is a standard CS. Thus, the method is not technology-dependent and, in contrast with previous approaches, our results can be used regardless of the final technology platform selected to implement the CS. In fact, any code-generation method or tool able to generate code from a CS could be enhanced with our method to automatically generate incremental constraints in any final technology platform, with only minor adaptations. Moreover, the efficiency of the generated constraints is comparable to the efficiency obtained by existing methods for relational and deductive databases.
Up to now, our method has been (partially) adopted by the RoclET and SAP MOIN tools.
Method overview
Our method proposes several improvements over the default checking of OCL constraints. A direct check of an OCL constraint implies (1) checking the constraint after each modification of the IB and (2) evaluating the constraint body over all instances of its context type every time the constraint is checked. This is clearly inefficient since it entails many irrelevant verifications. Roughly speaking, to get an efficient evaluation of each constraint we must ensure that:
- Constraints are only checked after changes on the data that may induce a constraint violation.
- Constraints are syntactically written in the best way (for verification purposes). Sometimes we may need to reshape the original constraint representation provided by the designer into an equivalent alternative best suited to check the IB after the execution of some of the problematic events.
- Constraints are only checked over the "affected" part of the IB (the one that is directly or indirectly modified by the problematic events). Assuming that the IB was in a consistent state prior to the modification, the rest of the data will remain consistent. Of course, the computation of the "affected" part is not trivial.
Our method takes these three aspects into account when proposing the techniques that guarantee an efficient checking of OCL constraints. As an example, given the CS of Figure 1 including the MaxSalary constraint (stating that the salary of an employee must not exceed the max salary defined in his/her department), our method provides the following:
1 - Determining the structural events that may violate a constraint (see Determining the Structural Events That May Violate an Integrity Constraint)
Not all kinds of events may induce a violation of a given constraint. For instance, only the update of the salary and maxSalary attributes and the insertion of a new relationship (i.e. link) in WorksIn may violate the previous MaxSalary constraint. Other events, such as the update of the name of a department, the removal of an employee or even the insertion of an employee (when the employee is not assigned to any department), do not violate the constraint. Hence, we may improve the efficiency of integrity checking by skipping the verification of MaxSalary after executing operations that do not include any of the potentially violating events.
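The filtering described above can be pictured as a lookup from issued events to the constraints they may violate. The following is only a minimal sketch under stated assumptions: the `Event` encoding, the `POTENTIALLY_VIOLATING` table and the `constraints_to_check` helper are hypothetical names introduced for illustration, and the table is filled in by hand from the MaxSalary example rather than computed from the OCL expression as the method does.

```python
from dataclasses import dataclass

# Hypothetical event encoding (not taken from the tool): a kind plus a target.
@dataclass(frozen=True)
class Event:
    kind: str    # "update", "insert", "delete", "insert_link" or "delete_link"
    target: str  # "Type.attribute", a type name, or an association name

# Events that may violate MaxSalary, copied by hand from the example in the text.
POTENTIALLY_VIOLATING = {
    "MaxSalary": {
        Event("update", "Employee.salary"),
        Event("update", "Department.maxSalary"),
        Event("insert_link", "WorksIn"),
    }
}

def constraints_to_check(events):
    """Return the names of the constraints that need checking after `events`."""
    issued = set(events)
    return sorted(
        name for name, risky in POTENTIALLY_VIOLATING.items() if risky & issued
    )

# An operation that only renames a department and removes an employee.
harmless = [Event("update", "Department.name"), Event("delete", "Employee")]
# The same operation plus a salary update.
risky = harmless + [Event("update", "Employee.salary")]

print(constraints_to_check(harmless))
print(constraints_to_check(risky))
```

With this table, the first operation triggers no check at all, while the second one triggers the verification of MaxSalary only.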
2 - Computing the incremental expression to verify a constraint after executing a given structural event (see Incremental Evaluation of OCL Constraints)
To minimize the number of entities examined when checking a constraint c after an event ev, the tool computes an OCL expression exp that can be used instead of c to verify that the state of the IB is still consistent with respect to c after the execution of ev. The state is consistent iff exp evaluates to true. As an example, the expression exp required to verify MaxSalary after a salary update over an employee e is e.salary <= e.employer.maxSalary. Note that, after this event, we just need to check the relationship between the updated employee (represented by the variable e) and his/her department. Therefore, we avoid verifying all departments (we just access the department where the modified employee works), and for that department we just compare its maxSalary with the salary of the updated employee, thus skipping the verification of the other employees working in the same department.
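To make the saving concrete, the sketch below contrasts a direct check of MaxSalary with the incremental expression for a salary update. It is an illustration only: the Python classes, the `hire` helper and the counters of examined employees are assumptions added here, and an employee without a department is assumed to satisfy the constraint trivially.

```python
from dataclasses import dataclass, field

@dataclass
class Department:
    name: str
    max_salary: int
    employees: list = field(default_factory=list)

@dataclass
class Employee:
    name: str
    salary: int
    employer: Department = None

def hire(emp, dept):
    """Create a WorksIn link between an employee and a department."""
    emp.employer = dept
    dept.employees.append(emp)

def check_all(departments):
    """Direct check: evaluate MaxSalary over every department and employee."""
    ok, examined = True, 0
    for d in departments:
        for e in d.employees:
            examined += 1
            ok = ok and e.salary <= d.max_salary
    return ok, examined

def check_after_salary_update(e):
    """Incremental check: e.salary <= e.employer.maxSalary for the updated employee."""
    if e.employer is None:  # assumption: no department, nothing to violate
        return True, 1
    return e.salary <= e.employer.max_salary, 1

sales = Department("Sales", 5000)
research = Department("Research", 7000)
staff = [Employee(f"e{i}", 3000) for i in range(6)]
for i, e in enumerate(staff):
    hire(e, sales if i % 2 == 0 else research)

staff[0].salary = 5500  # salary update on one Sales employee
full = check_all([sales, research])
incremental = check_after_salary_update(staff[0])
print(full, incremental)
```

Both checks detect the violation, but the direct check examines all six employees while the incremental one examines only the updated employee.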
3 - Automatic Generation of an efficient CS
With the previous feature alone the designer would be in charge of generating an implementation of the CS that benefits from exp to efficiently verify the constraint. However, our method is also able to modify an initial CS to ensure that all its constraints are efficiently verified.
First, the method determines which events may violate each constraint. Then, for each event, it computes an appropriate alternative (though semantically equivalent) representation of the constraint to be used when verifying the IB after executing events of that kind. Searching for this alternative representation may imply expressing the constraint using a different type as a context type. See Incremental Evaluation of OCL Constraints for the definition of the best context type for a constraint after a certain event, and Transforming techniques for OCL Constraints for the definition of how to obtain the new alternative.
Then, we process the CS in the following way (see Computing the Relevant Instances That May Violate an OCL Constraint): (1) the CS is extended with a set of new types that record the events issued during the operation execution, and (2) each constraint (either one of the original ones or one of the generated alternatives) is redefined to be evaluated only over the instances affected by the events for which that constraint is an appropriate alternative. The redefinition process consists in changing the context type of the constraint to a new derived subtype of the previous context type. These derived subtypes are designed (by means of their derivation rules) to hold only the affected instances. Therefore, after the redefinition, the constraint body is evaluated only over the instances of the derived subtype, i.e. over the affected instances. Figure 2 shows the generated schema for the CS of Figure 1 (Poseidon for UML is used to display the generated XMI) together with the redefined ICs and derivation rules. Note that during the process we have generated a new alternative for MaxSalary (MaxSalary2, defined using Employee as a context type and with the body self.salary <= self.employer.maxSalary). This alternative is used to check the consistency of the data in the IB after two kinds of events: the update of the salary of an employee and the insertion of a new link between an employee and a department. The original constraint is only used to check the IB after updates of the maxSalary attribute.
Both alternatives have been redefined over the new derived subtypes EmployeeMaxSalary2 and DepartmentMaxSalary. According to their derivation rules, DepartmentMaxSalary contains those departments where the maxSalary attribute has been modified during the update of the IB while EmployeeMaxSalary2 contains those employees that have been assigned to a department or that have changed the value of their salary attribute. The information about the executed events is recorded in the new types uDepartmentMaxSalary, uEmployeeSalary and iWorksIn.
context DepartmentMaxSalary inv MaxSalary: self.employee->forAll(e: Employee | e.salary <= self.maxSalary)
context DepartmentMaxSalary::allInstances() : Set(Department) body: uDepartmentMaxSalary::allInstances().ref->asSet()
context EmployeeMaxSalary2 inv MaxSalary2: self.salary <= self.employer.maxSalary
context EmployeeMaxSalary2::allInstances() : Set(Employee) body: uEmployeeSalary::allInstances().ref->union(iWorksIn::allInstances().refEmployee)->asSet()
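The effect of the redefined constraints and derivation rules above can be imitated in a few lines of Python. This is a rough sketch, not the generated schema: the `EventLog` class stands in for the event-recording types uDepartmentMaxSalary, uEmployeeSalary and iWorksIn, its two methods mirror the derivation rules of DepartmentMaxSalary and EmployeeMaxSalary2, and the `unique` and `violations` helpers are hypothetical names introduced for the example.

```python
class Department:
    def __init__(self, name, max_salary):
        self.name, self.max_salary, self.employees = name, max_salary, []

class Employee:
    def __init__(self, name, salary):
        self.name, self.salary, self.employer = name, salary, None

def unique(items):
    """Deduplicate by identity while keeping order (the role of ->asSet())."""
    seen, out = set(), []
    for x in items:
        if id(x) not in seen:
            seen.add(id(x))
            out.append(x)
    return out

class EventLog:
    """Stands in for uDepartmentMaxSalary, uEmployeeSalary and iWorksIn."""
    def __init__(self):
        self.u_department_max_salary = []  # departments whose maxSalary changed
        self.u_employee_salary = []        # employees whose salary changed
        self.i_works_in = []               # inserted (employee, department) links

    def department_max_salary(self):
        """Population of the derived subtype DepartmentMaxSalary."""
        return unique(self.u_department_max_salary)

    def employee_max_salary2(self):
        """Population of the derived subtype EmployeeMaxSalary2."""
        return unique(self.u_employee_salary + [e for e, _ in self.i_works_in])

def violations(log):
    """Evaluate each constraint only over the instances of its derived subtype."""
    bad = []
    for d in log.department_max_salary():  # inv MaxSalary
        if not all(e.salary <= d.max_salary for e in d.employees):
            bad.append(("MaxSalary", d.name))
    for e in log.employee_max_salary2():   # inv MaxSalary2
        if e.employer is not None and not e.salary <= e.employer.max_salary:
            bad.append(("MaxSalary2", e.name))
    return bad

sales = Department("Sales", 5000)
ann, bob = Employee("Ann", 4000), Employee("Bob", 4500)
for e in (ann, bob):
    e.employer = sales
    sales.employees.append(e)

log = EventLog()
ann.salary = 5200
log.u_employee_salary.append(ann)  # record the salary update event
result = violations(log)
print(result)
```

Since only a salary update on Ann was recorded, DepartmentMaxSalary is empty and MaxSalary2 is evaluated over Ann alone; Bob is never examined.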
Tool implementation
We have implemented a prototype tool for the efficiency improvements provided by our method. Download this file and follow the instructions in the getting-started.txt file included in the zip. Tool programmers: Carol Cervelló, Raúl Solana and Jordi Cabot.
The tool is implemented as a set of Java classes extended with the libraries of the Dresden OCL toolkit (for parsing and loading the OCL ICs) and NetBeans MDR (for the import/export from XMI files).
Given an XMI file (such as this one, representing the schema of Figure 1) and a set of OCL integrity constraints expressed in their concrete (textual) representation (such as this one, with the MaxSalary constraint), the tool internally stores the CS and the constraints as instances of the UML and OCL metamodels, respectively.
By processing these internal representations, the tool reports the set of events that may violate each constraint (screenshot 2), the set of constraints an event may violate (screenshot 3), and the incremental expressions to be used when verifying each constraint after each type of event (screenshot 4). The tool is also able to generate the XMI corresponding to the efficient schema and a text file with the new constraints and derivation rules (as shown in Figure 2).