Object-Resource Binding Guideline
Latest version: http://dfdf.inesc-id.pt/tr/orb
This version: http://dfdf.inesc-id.pt/tr/doc/orb/20071122
Editors
- Xiaoshu Wang (xiao at kdbio.inesc-id.pt)
- Jonas S. Almeida (jalmeida at mdanderson.org)
- Arlindo L. Oliveira (aml at inesc-id.pt)
Abstract
RDF is a graph-structured logic language that is used in the semantic web to represent knowledge. Putting that knowledge to use, however, requires that RDF structures be bound to programming constructs so that appropriate software tools can run the right tasks under suitable circumstances. This document describes the differences between the two modeling paradigms and discusses the basic principles that may guide the design of an object-resource binding framework.
As with all other documents presented on this site, terminology, special notations and syntax are described in DFDF terms.
1. OOAD vs. RDF Modeling
1.1. Open vs. closed world assumptions
1.2. Declarative vs. procedural
2. How to Approach Object-Resource Binding
2.1. Implicit vs. explicit approach
2.2. Language Choices
3. Functionalities
4. References
1. OOAD vs. RDF Modeling
Despite their many shared terms, such as class and property, Resource Description Framework (RDF) and Object Oriented Analysis and Design (OOAD) are two fundamentally different conceptual modeling systems. RDF models an application domain as a declarative system, in which the semantics of a resource is defined by what it is in relation to others. OOAD, on the other hand, models a domain as an interactive system, in which the semantics of an object[a] is defined by what it does in response to others. This difference is reflected in their respective operational semantics and modeling styles.
1.1. Open vs. closed world assumptions
Open- and closed-world assumptions are two different kinds of operational semantics. A system based on the closed-world assumption assumes that it contains all and only the relevant facts. Thus, if the semantic mapping between the language and the system's knowledge is incomplete, the closed-world assumption gives the system a default solution: whatever is not known to be true is treated as false. Most relational databases and programming languages operate under the closed-world assumption.
From a modeling standpoint, OOAD, or the more recently established Model Driven Architecture (MDA), does not limit the kinds of systems it can support. Its role as the design layer of a software system nevertheless dictates that its solutions take the closed-world approach. The reason is understandable: for computation purposes, type checking in most, if not all, Object Oriented Programming (OOP) languages requires that everything about a type be known at runtime[b].
The definition of an OOP class is closed with regard to its relationships to other objects. For instance, given the class definitions shown in Figure 1, a Spot instance can only modify its "shape" attribute. Any attempt to set an attribute other than "shape", for instance the "virtualGel" shown in the example, would lead to an error.

Figure 1 - Closed world assumption in OOP class.
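The closed nature of a class definition can be illustrated with a short Java sketch. Since the figure is not reproduced here, the Spot and Ellipse classes below are assumptions modeled on the description above; only the "shape" and "virtualGel" names come from the text, and the ClosedWorldDemo helper is purely illustrative.

```java
import java.lang.reflect.Field;

// Hypothetical stand-ins for the classes described for Figure 1.
class Ellipse {
    double xRadius;
    double yRadius;
}

class Spot {
    Ellipse shape; // the only attribute the class definition admits
}

class ClosedWorldDemo {
    // Returns true only if the class itself declares a field with this name.
    static boolean declares(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            return field != null;
        } catch (NoSuchFieldException e) {
            return false; // to the class, an undeclared attribute does not exist
        }
    }

    public static void main(String[] args) {
        Spot spot = new Spot();
        spot.shape = new Ellipse();        // allowed: declared in Spot
        // spot.virtualGel = new Object(); // would not compile: no such field
        System.out.println(declares(Spot.class, "shape"));
        System.out.println(declares(Spot.class, "virtualGel"));
    }
}
```

In a statically typed language the error surfaces at compile time; the reflective check merely makes the same fact observable at runtime.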
Modeling in RDF, however, is different. RDF takes the open-world approach by acknowledging that there can be unknown facts that are true. The class membership of an RDF resource is acquired by making simple assertions rather than by instantiation, which in a programming language translates into memory allocation. Any system handling the RDF statements must honor an assertion unless other information within the system prevents it from being true. For a given RDF resource, therefore, neither the absence of an optional property nor the presence of an unknown one would necessarily revoke the resource's declared class membership. For instance, resource http://www.charlestoncore.org/ont/example/spot2 is asserted to be an instance of cce:Spot in this article[1]. Although the engaged spot ontology only designates a constraint on a cce:Spot's shape, it is still valid to assert other properties of the spot, such as sup:virtualGel or dc:creator, as shown in the following example.
<http://www.charlestoncore.org/ont/example/spot2> sup:virtualGel ex:gel3 ;
    dc:creator "Xiaoshu Wang" .
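For contrast with a closed class definition, the statements above can be held in an open-ended structure. The following is a minimal sketch, not a reasoner: the Resource class, its method names and the OpenWorldDemo helper are all hypothetical, and membership is decided only by looking up an asserted rdf:type.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// A hypothetical, minimal open-world resource: a URI plus an open set of statements.
class Resource {
    static final String RDF_TYPE = "rdf:type";

    final String uri;
    private final Map<String, Set<String>> statements = new LinkedHashMap<>();

    Resource(String uri) {
        this.uri = uri;
    }

    // Any property may be asserted; nothing is checked against a fixed schema.
    void assertStatement(String property, String value) {
        statements.computeIfAbsent(property, k -> new LinkedHashSet<>()).add(value);
    }

    Set<String> values(String property) {
        return statements.getOrDefault(property, Collections.emptySet());
    }

    // Membership holds because it was asserted, whatever other properties exist.
    boolean isA(String rdfClass) {
        return values(RDF_TYPE).contains(rdfClass);
    }
}

class OpenWorldDemo {
    static Resource spot2() {
        Resource spot = new Resource("http://www.charlestoncore.org/ont/example/spot2");
        spot.assertStatement(Resource.RDF_TYPE, "cce:Spot");
        spot.assertStatement("sup:virtualGel", "ex:gel3");
        spot.assertStatement("dc:creator", "Xiaoshu Wang");
        return spot;
    }

    public static void main(String[] args) {
        Resource spot = spot2();
        System.out.println(spot.isA("cce:Spot"));
        System.out.println(spot.values("dc:creator"));
    }
}
```

Neither the extra properties nor the missing shape statement affects the result of isA, which mirrors the point made above about declared class membership.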
This difference in semantic model subsequently affects how a modeling system is used to model an application domain. With an open-world system, such as RDF, a domain entity is modeled to be consistent with what it is. With a closed-world system, the entity is modeled to be complete in terms of fulfilling its desired functionality.
1.2. Declarative vs. procedural
The semantics of RDF data are declarative; RDF data are often the targets of logic-based reasoning systems, which could manipulate the data in ways beyond what the system designers can possibly foresee. The main function of a logic system is consistency checking. Thus, in a logic-based modeling system such as RDF/OWL, whether a property should be constrained to a class is determined solely by whether it is necessary for the class to be what it is designed to be.
The semantics of OOP entities, on the other hand, are procedural, because OOP entities are designed for building software systems that achieve a particular result. Whether a property should be specified in a class is mostly determined by its necessity in fulfilling the class's defined behavior. For instance, to build a program for calculating the area of an ellipse, we will be more inclined to adopt the design of foo.Ellipse than that of bar.Ellipse shown in Figure 2. The design of bar.Ellipse is not wrong. But within the given context, it would be considered an inferior design because it unnecessarily increases memory consumption without gaining any desired functionality in return.

Figure 2 - Two OOP designs of Ellipse
In short, modeling in RDF and modeling in OOP focus on two different concerns. The former is concerned with what an entity is, whereas the latter is concerned with what an entity does. Because these two concerns are orthogonal, there are no objective criteria for evaluating an object-resource binding. In other words, an RDF resource can, in principle, be mapped to any OOP object and vice versa. What ultimately determines a binding is the task at hand. For instance, if the area() method of the classes shown in Figure 2 is implemented by returning 3.14×x-radius×y-radius, binding a cce:Ellipse to them naturally makes sense. But if, on the other hand, our task is to study the shape distribution of some cce:Ellipses, it would also make sense to bind cce:Ellipse to the Point class, with x-radius mapped to x-position and y-radius to y-position.
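The two task-dependent bindings just described can be sketched in Java. This is an illustration under assumptions: the resource is reduced to a map keyed by the property names used in the text ("x-radius", "y-radius"), the Ellipse and Point classes and the EllipseBindings helper are hypothetical, and Math.PI is used where the text approximates pi as 3.14.

```java
import java.util.HashMap;
import java.util.Map;

// Binding for the area task: the resource becomes an object that can compute its area.
class Ellipse {
    final double xRadius;
    final double yRadius;

    Ellipse(double xRadius, double yRadius) {
        this.xRadius = xRadius;
        this.yRadius = yRadius;
    }

    double area() {
        return Math.PI * xRadius * yRadius;
    }
}

// Binding for the shape-distribution task: each ellipse becomes a point in radius space.
class Point {
    final double xPosition;
    final double yPosition;

    Point(double xPosition, double yPosition) {
        this.xPosition = xPosition;
        this.yPosition = yPosition;
    }
}

class EllipseBindings {
    // Both binders read the same two property values of a cce:Ellipse resource.
    static Ellipse bindToEllipse(Map<String, Double> resource) {
        return new Ellipse(resource.get("x-radius"), resource.get("y-radius"));
    }

    static Point bindToPoint(Map<String, Double> resource) {
        return new Point(resource.get("x-radius"), resource.get("y-radius"));
    }

    public static void main(String[] args) {
        Map<String, Double> ellipseResource = new HashMap<>();
        ellipseResource.put("x-radius", 2.0);
        ellipseResource.put("y-radius", 1.0);

        System.out.println(bindToEllipse(ellipseResource).area());
        Point point = bindToPoint(ellipseResource);
        System.out.println(point.xPosition + ", " + point.yPosition);
    }
}
```

The same resource data feeds both classes; only the task decides which binding is the useful one.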
2. How to Approach Object-Resource Binding
2.1. Implicit vs. explicit approach
Ideally, with a logic-based knowledge representation (KR) system, such as RDF or OWL, the only programming need is to specify a goal and a set of rules to achieve the goal. The rest of the processing could simply be handed to a general problem-solving engine. But in reality, this is unlikely to happen - at least not in the foreseeable future. First, not all semantics can be intensionally described in RDF/OWL. For instance, the application semantics of a df:Transformation - say, a df:WebTransfer - can hardly be explicitly described through its relationships to other resources. Second, the majority of software programs are still written in non-logic-based programming languages, such as OOP languages. Using these programs to process or act upon an RDF-encoded knowledge base requires a binding between the RDF structures and programming constructs.
Two approaches are possible. The first is the implicit binding approach, in which the binding algorithms are defined in a standard document that is to be followed by all interested parties. This is the usual approach taken by most OOP languages for binding XML data. For instance, for the Java programming language, there is the Java Architecture for XML Binding (JAXB), and for C++, there is the C++/Tree Mapping. The advantages of the implicit binding approach are twofold. First, software libraries can be developed against the standard and reused elsewhere. Second, objects generated against the same XML schema can be globally shared.
Unfortunately, the implicit approach will not work well for binding RDF data. The reason, once again, comes from the different semantic models, i.e., the open- vs. closed-world assumptions taken by RDF and OOP, respectively. Because the open-world assumption allows an ontological term to be bound with properties other than those designated in a given ontology, binding against an ontology does not guarantee that the generated objects can be shared elsewhere. For instance, against http://www.charlestoncore.org/ontology/example, the cce:Spot can be easily bound to the following Java class[c].
public class Spot {
    public Ellipse shape;
}
But against http://www.charlestoncore.org/ontology/supplement, which imports the previous ontology, the same cce:Spot will be bound to the following class that is not quite interchangeable with the previous one.
public class Spot {
    public Ellipse shape;
    public VGel virtualGel;
}
Furthermore, the implicit binding approach may unnecessarily populate an OOP class with properties irrelevant to the class's desired functionality. For instance, a simple owl:imports of the Dublin Core (DC) metadata elements into the above ontology would bind the Spot class with fifteen additional properties, because the domains of the DC elements are not constrained. Even if they were constrained to, say, a hypothetical dc:Entity, they would still be applicable to cce:Spot, because the open-world assumption would simply interpret a cce:Spot as a dc:Entity as well. Thus, pragmatically speaking, the implicit approach is not a feasible solution either, because a few imported ontologies may easily render the OOP class definitions unmanageable.
Therefore, it appears that the only sensible approach is to define object-resource binding (ORB) explicitly. With this approach, the bindings can be selectively made using a description language. And by publishing the binding description on the web, objects generated against the same description can still be interchangeable.
Two kinds of semantics can be described in an ORB binding. The first is the data semantics, which is typically reflected in the structural relationships among the modeled entities. In RDF, data semantics are commonly established by the domain and range classes of an rdf:Property; in OOP, they are modeled through object composition, where simple types act as the attributes of more complex types. The similarities between these structures should make it quite straightforward for an ORB framework to align the corresponding entities (see Figure 3).

Figure 3 - Two types of semantics that can be described in an ORB. The blue shapes denote OOP concepts and the pink shapes denote RDF concepts. The red dotted line indicates what needs to be described in ORB.
The second type of semantics is an entity's application semantics. Unlike data semantics, which are manifested through an entity's structure, application semantics are manifested through an entity's behaviors. Take the semantics of a df:WebTransfer as an example. Structurally, it is defined by two df:InfoSpaces along with a string property df:mineType (see its definition in the stream ontology). But functionally, it denotes the process of filling a df:ByteStream with the bytes retrieved via a network transport protocol. Since application semantics can be naturally expressed as the methods of an object class in OOP, an ORB framework should also be capable of describing the binding between an RDF class and an OOP method.
2.2. Language Choices
Unlike the implicit approach, which defines the binding in human language, the explicit approach should use a machine-understandable language to describe the binding. Since the subject of the description is the binding between RDF and OOP, the choice naturally comes down to either RDF or something similar to a programming language. RDF has many well-known advantages, which we will not repeat here. But the binding could nevertheless be described in a programming syntax as well. For instance, to describe the object-resource binding for Java, using plain Java syntax along with Java annotations might, in fact, be a better solution. Unlike implicit binding frameworks, which usually derive the structure of classes from a data schema, an explicit binding framework normally specifies the composition of classes and their corresponding entities directly. Hence, using programming-language syntax makes it easier both to write the class definitions and to specify the bound RDF entities.
Using programming-language syntax has its disadvantages as well, of course: the description is tied to a specific programming language and cannot take advantage of the wealth of functionality provided by RDF. For instance, to support potential roundtrip engineering, the binding between an OOP type and an rdfs:Class should ideally be modeled as a one-to-one relationship. At first glance, such a constraint may seem restrictive because it would prohibit us from creating a class that aggregates data from several rdfs:Classes. With RDF, however, the issue can be easily solved by creating, within the description document, an rdfs:Class as the union of the rdfs:Classes of interest. Achieving the same with programming-language syntax would be a more involved process, requiring either new syntax or the deployment of another ontology.
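To make the Java-annotation option concrete, here is a minimal sketch of what such a description might look like. The @BoundClass and @BoundProperty annotations, the BindingReader helper and the cce:shape property name are all hypothetical; they are not part of any published ORB vocabulary and only illustrate the idea of stating a binding explicitly and selectively.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical annotation naming the rdfs:Class a Java type is bound to.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface BoundClass {
    String value();
}

// Hypothetical annotation naming the rdf:Property a field is bound to.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface BoundProperty {
    String value();
}

class Ellipse {
}

@BoundClass("cce:Spot")
class Spot {
    @BoundProperty("cce:shape") // assumed property name
    Ellipse shape;

    // Not annotated, so it takes no part in the binding.
    int cachedHash;
}

class BindingReader {
    // Returns the bound RDF class name, or null if the type is unbound.
    static String boundClass(Class<?> type) {
        BoundClass annotation = type.getAnnotation(BoundClass.class);
        return annotation == null ? null : annotation.value();
    }

    // Maps each annotated field name to the RDF property it is bound to.
    static Map<String, String> boundProperties(Class<?> type) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            BoundProperty annotation = field.getAnnotation(BoundProperty.class);
            if (annotation != null) {
                result.put(field.getName(), annotation.value());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(boundClass(Spot.class));
        System.out.println(boundProperties(Spot.class));
    }
}
```

The sketch also shows the trade-off discussed above: the description is compact and sits next to the class definition, but it is readable only by Java tooling.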
The point we want to raise here is that RDF should not be taken as a panacea for every problem. Although RDF's data structure - the directed labeled graph - makes it capable of describing any problem, RDF is perhaps best used as the glue technology for integrating other types of engineering artifacts, owing to its semantic clarity and open-world semantics. RDF has a very verbose syntax, which makes it unsuitable for carrying large volumes of data - one of the reasons that prompted us to develop DFDF. In addition, the open-world assumption makes the language difficult to work with under certain circumstances. Hence, whether to use RDF in an application domain should be judged by the problem at hand. Two important questions to ask are (1) whether the problem needs an open-world solution and (2) whether the solution needs to be web-enabled, i.e., whether URIs are important. If the answers to both questions are negative, it might be easier to use an alternative syntax with a closed-world model. As to the ORB problem described in this article, we do not yet have a clear answer. Clearly, an ORB binding can be openly extended and interpreted on the web. But we are not sure whether this will be needed in practice. Nevertheless, for the sake of language independence and generality, we have developed the Object-Resource-Binding Ontology for this purpose.
3. Functionalities
An ORB description can be used for three purposes (see Figure 4). First, the description can be used to compile object classes or ontologies. But since a logic-based ontology language, such as OWL, often carries more semantic capabilities than can be described in OOP, the binding compilation is more likely to take place in only one direction, namely generating object classes. Thus, if the binding description happens to be written in a programming language, the binding compilation may become entirely unnecessary.

Figure 4 - Functionalities supported by an ORB framework
The second use of an ORB binding description is to support the building of a runtime API, with which (1) resource instances can be un-marshaled into objects for the appropriate computation and (2) the computed objects can be marshaled into an RDF document to be shared on the web.
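A runtime API of this kind might, in its simplest form, look like the sketch below. It is only an illustration under assumptions: the resource description is reduced to a map of property names to values, and the SpotBinding class, the cce:shape property name and the ex:ellipse1 value are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Spot {
    String uri;
    String shape; // URI of the bound shape resource, kept as a string for brevity
}

class SpotBinding {
    static final String SHAPE = "cce:shape"; // assumed property name

    // (1) Un-marshal: copy only the bound property from the description into an object.
    static Spot unmarshal(String uri, Map<String, String> description) {
        Spot spot = new Spot();
        spot.uri = uri;
        spot.shape = description.get(SHAPE);
        return spot;
    }

    // (2) Marshal: write the object's bound state back as property/value pairs.
    static Map<String, String> marshal(Spot spot) {
        Map<String, String> description = new LinkedHashMap<>();
        description.put("rdf:type", "cce:Spot");
        if (spot.shape != null) {
            description.put(SHAPE, spot.shape);
        }
        return description;
    }

    public static void main(String[] args) {
        Map<String, String> description = new LinkedHashMap<>();
        description.put(SHAPE, "ex:ellipse1");
        description.put("dc:creator", "Xiaoshu Wang"); // unbound, so ignored

        Spot spot = unmarshal("http://www.charlestoncore.org/ont/example/spot2", description);
        System.out.println(marshal(spot));
    }
}
```

Note that properties outside the binding, such as dc:creator here, do not survive the round trip in this sketch; a fuller API would need a policy for carrying them along.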
The third use of an ORB description is to support application building. This function is, in fact, unique to ORB owing to its ability to describe the binding of a resource's application semantics. Take a DFDF application as an example. Because a series of simple df:Transformations is conceptually equivalent to one complicated df:Transformation, the problem of a DFDF application can be reduced to the execution of a df:Transformation's application semantics. Thus, given the hypothetical binding shown in Figure 5, the task can be accomplished with the following pseudo-code.
1. XformImpl xform = new XformImpl();
2. (a) xform.src = srcSpace;   // assume srcSpace is an instance of InfoSpace1
   (b) xform.dest = destSpace; // assume destSpace is an instance of InfoSpace2
   (c) // set other necessary states
3. xform.do(); // execute the application semantics
4. // use destSpace to produce the result or for further processing
Since the above procedure can be piped to execute any combination of df:Transformations, what a DFDF application needs to do is simply to find out all necessary df:Transformations and the sequence of execution in order to obtain the desired information.
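The piping idea can be sketched in Java as follows. All names here are hypothetical stand-ins: InfoSpace is reduced to a list of strings, the method is called run() because do is a reserved word in Java, and each step returns its destination space instead of filling a dest field. The Pipeline class is itself a Transformation, reflecting the equivalence between a series of simple transformations and one complicated transformation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an object bound to df:InfoSpace.
class InfoSpace {
    final List<String> items = new ArrayList<>();
}

// Hypothetical binding of df:Transformation: the application semantics live in run().
interface Transformation {
    InfoSpace run(InfoSpace src);
}

class DropEmpty implements Transformation {
    public InfoSpace run(InfoSpace src) {
        InfoSpace dest = new InfoSpace();
        for (String item : src.items) {
            if (!item.isEmpty()) {
                dest.items.add(item);
            }
        }
        return dest;
    }
}

class UpperCase implements Transformation {
    public InfoSpace run(InfoSpace src) {
        InfoSpace dest = new InfoSpace();
        for (String item : src.items) {
            dest.items.add(item.toUpperCase(Locale.ROOT));
        }
        return dest;
    }
}

// A series of transformations behaves as a single transformation.
class Pipeline implements Transformation {
    private final List<Transformation> steps;

    Pipeline(Transformation... steps) {
        this.steps = Arrays.asList(steps);
    }

    public InfoSpace run(InfoSpace src) {
        InfoSpace current = src;
        for (Transformation step : steps) {
            current = step.run(current); // the output of one step feeds the next
        }
        return current;
    }

    public static void main(String[] args) {
        InfoSpace src = new InfoSpace();
        src.items.addAll(Arrays.asList("a", "", "b"));
        InfoSpace dest = new Pipeline(new DropEmpty(), new UpperCase()).run(src);
        System.out.println(dest.items);
    }
}
```

What remains for the application, as noted above, is to discover which transformations are needed and in what order; the sketch assumes that sequence is already known.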

Figure 5 - A hypothetical DFDF application. Blue-lined shapes show the concepts modeled in RDF and black-lined shapes show the OOP concepts. The red-lined shapes show the potential bindings and descriptions.
4. References
1. Wang, X., R. Gorlitsky, and J.S. Almeida, From XML to RDF: How semantic web technologies will change the design of 'omic' standards. Nature Biotechnology, 2005. 23: p. 1099-1103.
[a] In this document, the word "object" is mostly used to refer to the object described by OOP whereas the word "resource" is used to refer to the object described by RDF.
[b] Of course, the outlook is improving now that dynamic languages, such as Python and Ruby, are entering the picture. Nevertheless, a programming language is about "function", so the "code" must be available at runtime.
[c] For the sake of simplicity, the RDF property is bound to a class as a public attribute. A more realistic Java binding algorithm would more likely bind an RDF property to a pair of getter and setter methods.

