"RDF captures associations between subjects and objects" (Michael Daconta).
"What matters is in the connections" (Tim Berners-Lee).
The "Entity-Attribute-Value Paradigm"
The world is very complex: in it we perceive objects and relations between objects. This complexity is greatly simplified by applying the paradigm of attributes. Indeed, many relationships can be expressed in an attribute notation involving only two components:
Entity.
An entity is something that exists and can be identified and differentiated. It can be abstract or concrete. For example: a specific person, a company, an event, etc.
Attribute.
An attribute is a characteristic, property or aspect of an entity. It corresponds to a certain semantic category (name, color, height, place, date, time, etc.). For example, a person's name, date of birth, etc. Usually an entity has several attributes.
An attribute is composed of:
An attribute name.
A value associated with the attribute name. It can be a single datum or a group of data.
There is a fuzzy boundary between entity, attribute name and even attribute value.
An attribute name can be considered an entity, albeit an abstract one, e.g., "color".
A value could be an attribute or a group of attributes.
An entity can be defined by a group (set or sequence) of attributes ("name and value" pairs).
An attribute can be an entity, because it can have, in turn, attributes. For example, the attribute "blue color" could have the attribute "light shade".
Sometimes attribute name and attribute value are condensed into a single identifier. For example, "bold" instead of "letter with intensity bold".
The entity-attribute-value paradigm provides a unified view of reality by means of a simplifying model, that of attributes. It constitutes a form of indirect description of reality by means of a generic, intelligible and meaningful schema.
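As a minimal sketch of the paradigm (written in Python, which the original text does not use; the class name `Entity` and its fields are illustrative assumptions), an entity can be modeled as an identifier plus a group of name/value pairs, where a value may itself be an entity with its own attributes:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Entity:
    """An identifiable thing described by attribute name/value pairs."""
    name: str
    # Attribute name -> value; a value may be plain data or another Entity.
    attributes: Dict[str, Any] = field(default_factory=dict)


# The value "blue" is itself an entity carrying its own attribute ("shade"),
# which illustrates the fuzzy boundary between entity, attribute and value.
blue = Entity("blue", {"shade": "light"})
table = Entity("table", {"color": blue, "height": 0.8})

shade_of_table_color = table.attributes["color"].attributes["shade"]
```

Here the same structure is used at every level, so no separate type is needed for "attribute" or "value".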
Types of attributes
The attributes of an entity can be:
Independent.
They are those that identify and describe a certain entity. We can also call them dimensions. For example, in a rectangle, its independent attributes (or dimensions) are the base and the height.
Dependent.
These are the ones obtained from the independent ones. In the previous example, the surface (area) of the rectangle is a dependent attribute.
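The distinction can be sketched in Python as stored versus derived fields. The class and field names below are assumptions made for illustration only:

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    # Independent attributes (dimensions): these are stored.
    base: float
    height: float

    @property
    def surface(self) -> float:
        # Dependent attribute: always derived from the independent ones.
        return self.base * self.height


r = Rectangle(base=3.0, height=2.0)
area = r.surface
```

Because `surface` is computed on demand, changing `base` or `height` can never leave it out of date.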
RDF (Resource Description Framework)
RDF is a language, model or framework for describing resources, their properties and their relationships to other resources, using metadata encoded in XML syntax. It is a language of choice for the Semantic Web, as it allows ontologies to be defined at a basic level.
The model it uses is called "triple", because it is a relation consisting of three parts:
Subject (the resource).
Predicate (property, attribute or aspect of the subject).
Object (a value or other resource).
That is, there is a descending scale: resources have properties, which in turn have values. For example, a book (resource) has an author (predicate) whose value (object) is the author's name. These three elements correspond to the entity-attribute-value triad.
The set formed by these three elements (subject-predicate-object) constitutes an RDF statement. A collection of RDF statements is a meta-document that describes the properties of a document. An RDF statement can contain other RDF statements.
While XML documents assign metadata (attributes) to parts of a document, one use of RDF is to create metadata (attributes) about the document as a separate entity such as author, creation date, etc.
The same resource can have different predicate-object type pairs. For example, a book might have, in addition to the author, the attributes: publisher, number of pages, year of publication, etc.
An RDF statement can be represented as a network composed of a subject node, an object node and an arc (the predicate) directed from the subject node to the object node. By chaining triples together, directed graphs are constructed, which are semantic networks. Such graphs are not necessarily acyclic, since a chain of triples can lead back to its starting node.
[A graph is a set of objects, called nodes, connected by links called edges. A directed graph is a graph in which the edges have a defined direction. An acyclic graph is a graph in which there are no paths starting and ending at the same node].
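The graph definitions above can be made concrete with a short Python sketch. The function names and the placeholder node labels ("a", "b", "c") are assumptions for illustration; this is not an RDF library API:

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
Graph = Dict[str, List[Tuple[str, str]]]


def build_graph(triples: List[Triple]) -> Graph:
    """Map each subject node to its outgoing (predicate, object) arcs."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)


def has_cycle(graph: Graph) -> bool:
    """Depth-first search: a cycle exists if we re-enter a node in progress."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        for _, target in graph.get(node, []):
            if visit(target):
                return True
        visiting.remove(node)
        done.add(node)
        return False

    return any(visit(n) for n in list(graph))


chain = build_graph([("a", "p", "b"), ("b", "q", "c")])
loop = build_graph([("a", "p", "b"), ("b", "q", "c"), ("c", "r", "a")])
```

In this sketch `chain` is a directed acyclic graph, while `loop` contains a path that starts and ends at node "a".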
The relationship between these three elements can also be thought of as a function. The predicate is the name of the function, the subject is the argument, and the result is the object. In classical notation,
predicate(subject) = object
In the book example, author(book) = author_name. For example,
author("Don Quixote") = Cervantes
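The functional reading can be sketched in Python as a lookup over a list of triples. The helper name `evaluate` is a hypothetical choice, and the sketch assumes each (subject, predicate) pair has at most one object, which RDF itself does not require:

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

triples: List[Triple] = [
    ("Don Quixote", "author", "Cervantes"),
]


def evaluate(predicate: str, subject: str) -> Optional[str]:
    """Return the object o such that predicate(subject) = o, or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o  # first match only; see the single-value assumption
    return None


result = evaluate("author", "Don Quixote")
```

When a predicate can have several objects for the same subject, the relation is no longer a function in the strict sense, and a list of objects would be a more faithful return type.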
Each of these three elements (subject, predicate, object) of an RDF statement is identified on the web by a URI (Uniform Resource Identifier), which is a unique and persistent identifier that does not depend on its physical location, unlike the URL (Uniform Resource Locator) of web pages, which is a physical address.
RDF allows you to define classes, subclasses and individuals. A class is a set of individuals that share one or more properties. Individuals belonging to a class are also called instances of the class.
RDF includes the so-called "containers" of multiple values/resources: Bag (unordered collection), Seq (sequence, an ordered collection) and Alt (alternative, a choice between multiple values/resources).
Limitations of RDF
Its syntax is complex. This is because its syntax is "forced" to be XML, which means that a simple list of sentences has to be expressed in an unnatural way as a hierarchical structure. Furthermore, this syntax does not allow a clear differentiation between resources and properties.
To overcome these difficulties in encoding the RDF/XML format (XML-based RDF), also called "serialization format", one turns to tools that encode RDF statements in a simpler intermediate format (called N3), based on triples. For example, GRDDL translates HTML or XML format to RDF.
RDF allows statement reification, that is, higher-order statements, where a statement is the object of another statement. But the object of a statement cannot be a collection of statements or the statements that satisfy certain properties. Moreover, when reification is used, RDF/XML syntax becomes even more complicated, to the point of being almost unreadable. No wonder, then, at the statement by Dieter Fensel, author of the foreword to [Daconta et al., 2003], that he needed... two months! to understand RDF.
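The idea of a statement being the object of another statement can be sketched with a nested data structure in Python. This is only an illustration of the concept: it does not use RDF's own reification vocabulary, and the subject "some_source" and predicate "asserts" are placeholders invented for the example:

```python
from typing import NamedTuple, Union


class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, "Statement"]  # a plain value or another statement


def depth(statement: Statement) -> int:
    """Count how many statements are nested along the object position."""
    if isinstance(statement.obj, Statement):
        return 1 + depth(statement.obj)
    return 1


inner = Statement("Don Quixote", "author", "Cervantes")
# Higher-order statement: its object is the statement above.
outer = Statement("some_source", "asserts", inner)
```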
RDF Schema
RDF has (like XML) its own type language or data model: RDF Schema (RDFS). In this case the types are classes (subjects) and properties (predicates).
It allows hierarchies of classes (subclasses and superclasses) and of properties to be defined.
There is inheritance between classes, including multiple inheritance: a class can be a subclass of more than one class.
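A class hierarchy with multiple inheritance can be sketched in Python as a mapping from each class to its direct superclasses. The class names other than "painter" and "artist" are hypothetical, and `ancestors` is an illustrative helper, not part of RDFS:

```python
from typing import Dict, Set

# Direct superclasses of each class.
superclasses: Dict[str, Set[str]] = {
    "painter": {"artist", "craftsperson"},  # two superclasses (hypothetical)
    "artist": {"person"},
    "craftsperson": {"person"},
}


def ancestors(cls: str) -> Set[str]:
    """Collect every direct or indirect superclass of cls."""
    found: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in superclasses.get(current, set()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


painter_ancestors = ancestors("painter")
```

An instance of "painter" would inherit the properties defined for every class in `painter_ancestors`.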
Limitations:
As with XML Schema, schema hierarchies are not allowed.
It is a separate language, not integrated, as it sits as a layer above RDF.
It is not a full semantic language.
It is extremely restricted. It only allows you to define ontologies at a very basic level.
Specification in MENTAL
Specification of the entity-attribute-value triad
An entity e with an attribute of name n and value v is specified in a very simple way by the expression
e/(n/v)
where there is a hierarchy of descending particularization.
For example, table/(color/green), where the entity table is qualified with only one attribute: color/green.
In case the entity has several attributes, a group (sequence or set) can be specified:
e/(n1/v1 n2/v2 ...) or e/{n1/v1 n2/v2 ...}
For example, table/(color/green height/0.8)
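To show how regular this notation is, the following Python sketch renders a nested mapping in the e/(n/v ...) form used above. The function `render` is a hypothetical helper written for this illustration; it is not part of MENTAL, and it covers only the parenthesized (sequence) form:

```python
from typing import Any


def render(name: str, value: Any) -> str:
    """Render a name and its value in slash notation, recursing into groups."""
    if isinstance(value, dict):
        # A group of attributes becomes a parenthesized, space-separated list.
        inner = " ".join(render(n, v) for n, v in value.items())
        return f"{name}/({inner})"
    return f"{name}/{value}"


one_attribute = render("table", {"color": "green"})
two_attributes = render("table", {"color": "green", "height": 0.8})
```

The same function handles any depth of nesting, since a value may itself be a group of attributes.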
Specifying relationships using attributes
An n-ary relationship (i.e., between n elements) can be expressed by n+1 attributes (which are binary relationships). For example, the assertion
John gave a book to Mary
is a ternary relation (between the elements John, book and Mary) that we can express as a certain event e consisting of four attributes:
The relation "is a(an)" is a binary "subject/predicate" type relation. For example,
Socrates/man // Socrates is a man
blue/color // blue is a color
cat/mammal // a cat is a mammal
"Las Meninas"/painting // "Las Meninas" is a painting.
painter/artist // a painter is an artist
Velázquez/painter // Velázquez is a painter
The relation "has a(an)" is a ternary relation of the form "entity/(attribute/value)". For example,
cat/(color/black) // cat has the property of being black
table/(NroLegs/3) // table has 3 legs
Difference between blue/color and color/blue:
blue/color is interpreted as "blue is a color" (subject/predicate).
color/blue is an attribute name and its value. They must particularize an entity. For example,
table/(color/blue)
Advantages of attribute notation
Description flexibility.
Attributes allow us to describe entities in a variety of ways. For example, a point in the plane can be described by its Cartesian coordinates (x, y) or by its polar coordinates (radius, angle). For example,
point/(x/1 y/0.5)
point/(radius/1.118 angle/26.57)
Handling flexibility.
The attribute is a binary relation that is easier to handle than the n-ary relation. Indeed, it is easy to modify attributes, delete them and add new ones. For example, if to the event e of the previous example we add the information that the book delivery took place in the park, we only need to include another attribute.
(place/"in the park")
Specification of incomplete information. For example, in the assertion
a book was given to Mary
it is sufficient to state only the known attributes and ignore the unknown ones:
e/(action/give
object/"a book"
receiver/Mary)
In contrast, with the n-ary representation it is necessary to give a name to the unknown subject of the action:
x gave a book to Mary
Representation of hierarchies.
In the above example, the tense of the verb "gave" (past tense) was ignored. It can be recorded as the attribute (tense/past), which is relative not to the event e, but to the attribute
(action/give)
If the book is also blue, we have the attribute (color/blue) relative to the attribute (object/"a book").
Expressiveness.
The notation of attributes is clearer and more expressive, since it makes clear what the characteristics of the entities and instances are. In this way the information is made explicit and its semantics is clarified. For example,
Train from Madrid to Barcelona
can be expressed as
Train/(Origin/Madrid Destination/Barcelona)
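Two of the advantages listed above, description flexibility and the handling of incomplete information, can be sketched in Python. The function `to_polar` and the dictionary layout are assumptions made for illustration; the attribute names come from the examples in the text, and the angle is expressed in degrees:

```python
import math
from typing import Tuple


def to_polar(x: float, y: float) -> Tuple[float, float]:
    """Convert Cartesian coordinates to (radius, angle in degrees)."""
    radius = math.hypot(x, y)
    angle = math.degrees(math.atan2(y, x))
    return radius, angle


# Description flexibility: the same point under two sets of attributes.
point_cartesian = {"x": 1, "y": 0.5}
radius, angle = to_polar(point_cartesian["x"], point_cartesian["y"])
point_polar = {"radius": round(radius, 3), "angle": round(angle, 2)}

# Incomplete information: only the known attributes are stated.
event = {"action": "give", "object": "a book", "receiver": "Mary"}

# Handling flexibility: new information is one more attribute.
event["place"] = "in the park"
```

For x = 1 and y = 0.5 the conversion gives a radius of about 1.118 and an angle of about 26.57 degrees. The event needs no placeholder for the unknown giver; the attribute is simply absent.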
Specification of entity classes and instances
The specification of entity classes is done by means of parameterized generic expressions. A typical way would be to define as parameters the values of the entity attributes. For example,
Advantages of MENTAL as an entity-attribute-value model
A homogeneous notation is used between entity and attribute, and between attribute name and value. The distinction between data and metadata is thus diluted.
There are not only three levels (entity-attribute-value), but any number of levels, as many as needed. The hierarchy goes from the generic (the entity) to the specific (the data), always descending.
Allows magnitudes (quantity and unit) to be specified:
Madrid/
((inhabitants/4*million) (altitude/660*m))
(Madrid has 4 million inhabitants and an altitude of 660 meters)
There is a fuzzy boundary between the different levels. An attribute (even a value) can play the role of entity, an attribute can be a value, a sub-attribute, etc. For example:
(color/green)/
((hue/dark)/
(grade/0.9))
Allows specifying fuzzy values such as 0.7*red or 0.8*high.
Allows you to assign attributes to any expression (a function, a rule, a set, etc.).
It is the ideal system for specifying semantic databases or ontologies, defined by attributes, and whose management is simpler, more understandable and transparent.
The model is analogous or equivalent to the one used as markup, replacing XML. The same language is used to describe (via attributes) parts of a document and the entire document. Four languages are not needed: XML, XMLS, RDF and RDFS. A single, simple language is sufficient. As in the case of markup, a complete language with all its combinatorial possibilities is available.
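The "quantity*unit" style of magnitude mentioned in the list above can be imitated in Python with operator overloading. This is only a loose analogy under stated assumptions: the class `Quantity` and the names `m` and `million` are invented for this sketch and say nothing about how MENTAL itself is implemented:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    amount: float
    unit: str

    def __rmul__(self, factor: float) -> "Quantity":
        # Supports the "number * unit" spelling, e.g. 660 * m.
        return Quantity(self.amount * factor, self.unit)


m = Quantity(1, "m")   # unit of length (assumed name)
million = 1_000_000    # plain multiplier, not a unit

madrid = {
    "inhabitants": 4 * million,
    "altitude": 660 * m,
}
```

The magnitude keeps its unit attached to the value, so the attribute "altitude" carries both the quantity and the unit in a single expression.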
Bibliography
Akerkar, R. Foundations of the Semantic Web. XML, RDF & Ontology. Alpha Science Intl. Ltd., 2009.
Allemang, Dean; Hendler, James. Semantic Web for the Working Ontologist. Effective Modeling in RDFS and OWL. Morgan Kaufmann, 2008.
Daconta, Michael C.; Obrst, Leo J.; Smith, Kevin T. The Semantic Web. A Guide to the Future of XML, Web Services, and Knowledge Management. Wiley, 2003. Available on the Internet.
Hjelm, Johan. Creating the Semantic Web with RDF. Professional Developer’s Guide. Wiley, 2001.