Index
General
RDF model API
- Why does the localname part of my URI look wrong?
- How do I change the URI or localName of a Resource?
Reasoner and inference models
- I want to develop my own rules, how do I get started?
- Why are there two different arrows ( -> and <- ) in the rule syntax?
- The domain and range inferences look wrong, is that a bug?
- Why do I get a warning: Creating OWL rule reasoner working over another OWL rule reasoner
Ontology API
- Why doesn't listClasses() (listProperties()/listIndividuals(), etc) work?
- Why doesn't the ontology API handle sub-class (or sub-property, domain, range, etc) relationships in a DAML model?
- Why does .as( OntProperty.class ) fail with ConversionException on SymmetricProperty (or other property types)?
Database and persistence
- Why do I get an exception when trying to create a new persistent model?
- Why do I run out of memory when trying to list statements in a persistent model?
- Has Jena2 persistence been ported to other database engines and platforms besides those officially supported?
- Is there a limit on the number of models in a database?
XML serialisation (reading and writing)
RDQL and query processing
- no entries yet
Miscellaneous
Answers
General
Q: Why do I get a ClassNotFoundException when I run Jena?
A: This means that one or more of the libraries that Jena depends on are not
on your classpath. Typically, all of the libraries (.jar files) in $JENA/lib,
where $JENA refers to the directory in which you installed Jena, should be on
your classpath. Consult the documentation for your JDK for details on setting
the classpath for your system. There are also a number of on-line tutorials on setting
the Java classpath, which a web search will find.
RDF model API
Q. Why does the localname part of my URI look wrong?
A: In Jena it is possible to retrieve the localname part of a Resource or Property
URI. Sometimes developers create Resources with a full URI reference but
find that the result of a getLocalName call is not quite what they expected.
This is usually because the URI is ill-formed or cannot be correctly split
in the way you expected. The only reason for separating namespace and local
name is to support the XML serialization in which qnames are used for properties
and classes. Thus the main requirement of the split is that the localname
component must be a legal XML NCName. This means it must start with a letter
or _ character and can contain only limited punctuation. In particular,
it can't contain spaces, but then spaces are not legal in URI references
anyway. In general, it is best not to use the localname split to encode
any information. You should only be concerned with it if you are coding
a parser or writer.
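The effect of the NCName requirement can be seen in a small sketch. The code below is not Jena's implementation. It is a simplified approximation written for illustration, which assumes that a name may start with a letter or _ and may continue with letters, digits, _, - or . (the class and method names are hypothetical).

```java
class LocalNameSketch {
    // Simplified NCName rules, assumed for illustration only.
    static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_';
    }

    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.';
    }

    // Index at which the local name starts; uri.length() if there is none.
    static int splitPoint(String uri) {
        int i = uri.length();
        // Walk back over characters that may appear inside an NCName.
        while (i > 0 && isNameChar(uri.charAt(i - 1))) {
            i--;
        }
        // Walk forward to the first character that may start an NCName.
        while (i < uri.length() && !isNameStart(uri.charAt(i))) {
            i++;
        }
        return i;
    }

    static String namespace(String uri) {
        return uri.substring(0, splitPoint(uri));
    }

    static String localName(String uri) {
        return uri.substring(splitPoint(uri));
    }

    public static void main(String[] args) {
        String[] uris = {
            "http://example.org/ns#name",
            "http://example.org/ns#123abc",
            "http://example.org/item/42"
        };
        for (String uri : uris) {
            System.out.println(namespace(uri) + " | " + localName(uri));
        }
    }
}
```

Under these assumptions the second URI splits into http://example.org/ns#123 and abc, and the third has an empty local name, because a digit cannot start an NCName.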
Q. How do I change the URI or localName of a Resource?
A: In Jena, the URI of a resource is invariant, so there is no setLocalName() or setURI()
method, and there will never be one.
The only way to "rename" a resource is to remove all of the statements that
mention resource R, and add new statements with R replaced by R'.
A utility for doing this is provided:
com.hp.hpl.jena.util.ResourceUtils.renameResource()
If you are working with inference or ontology models, you need to be
careful to do this on the base model, not the entailment model.
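The remove-and-add step can be sketched without Jena. The code below is an illustration only, not the utility's source: Triple is a hypothetical stand-in for a Jena Statement, the model is a plain Set, and only the subject and object positions are rewritten.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class RenameSketch {
    // A minimal triple of plain strings; a stand-in for a Jena Statement.
    static final class Triple {
        final String s, p, o;

        Triple(String s, String p, String o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Triple)) {
                return false;
            }
            Triple t = (Triple) other;
            return s.equals(t.s) && p.equals(t.p) && o.equals(t.o);
        }

        @Override
        public int hashCode() {
            return Objects.hash(s, p, o);
        }
    }

    // Replace oldUri by newUri wherever it occurs as subject or object.
    static void rename(Set<Triple> model, String oldUri, String newUri) {
        // Collect first, so the set is not modified while iterating over it.
        List<Triple> mentions = new ArrayList<>();
        for (Triple t : model) {
            if (t.s.equals(oldUri) || t.o.equals(oldUri)) {
                mentions.add(t);
            }
        }
        for (Triple t : mentions) {
            model.remove(t);
            model.add(new Triple(
                t.s.equals(oldUri) ? newUri : t.s,
                t.p,
                t.o.equals(oldUri) ? newUri : t.o));
        }
    }

    public static void main(String[] args) {
        Set<Triple> model = new HashSet<>();
        model.add(new Triple("eg:R", "eg:knows", "eg:b"));
        model.add(new Triple("eg:c", "eg:knows", "eg:R"));
        rename(model, "eg:R", "eg:R2");
        for (Triple t : model) {
            System.out.println(t.s + " " + t.p + " " + t.o);
        }
    }
}
```

Statements that do not mention the old URI are left untouched, which is why the operation is safe to run on a base model that also holds unrelated data.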
Reasoner and inference models
Q. I want to develop my own rules, how do I get started?
A: The GenericRuleReasoner is the place to start. You can create instances of this reasoner by
supplying either an explicit set of Rule objects or a configuration description (as a Jena Model)
that points to a local rule file. See the inference documentation for more details:
inference/index.html#rules
Q. Why are there two different arrows ( -> and <- ) in the rule syntax?
A: As explained in the documentation there are two rule systems available - a
forward chainer and a backward chainer. You can choose to use either or
use the two together in a hybrid mode.
So if we use Ti as shorthand for triple patterns like (?x rdf:type ?C),
and if we ignore functors and procedural call-outs for now, then the syntax:
T1, T2, ... TN -> T0 .
means that if the triple patterns T1 to TN match in the data set
then the triple T0 can be deduced as a consequence. Similarly
T0 <- T1, T2, ... TN .
means the same thing - the consequence is always on the "pointy" end of
the arrow.
Now if you are just using pure forward or backward rules then you could
choose to use either syntax interchangeably. This allows you to write a
rule set and use it in either mode. In practice, though, "->" is the more
conventional direction in forward systems and "<-" is the more
conventional one in backward systems.
The hybrid configuration allows you to create new backward rules as a
result of forward rules firing so that the syntax:
T1, T2 -> [T0 <- T3, T4] .
is saying that if both T1 and T2 match in the dataset then add the
backward rule "[T0 <- T3, T4]" after instantiating any bound variables.
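To make the "pointy end" reading concrete, here is a minimal sketch that splits a rule written with either arrow into the same premises and consequence. It is not Jena's rule parser: it only understands the Ti placeholder notation used above, splits the premises on commas, and treats a bracketed backward rule as an opaque consequence. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class RuleArrowSketch {
    // Premises (body) and consequence (head) of one rule, kept as plain text.
    static final class Rule {
        final List<String> body;
        final String head;

        Rule(List<String> body, String head) {
            this.body = body;
            this.head = head;
        }
    }

    // Parse "T1, T2 -> T0 ." or "T0 <- T1, T2 ." into the same Rule.
    static Rule parse(String text) {
        String t = text.trim();
        if (t.endsWith(".")) {
            t = t.substring(0, t.length() - 1).trim();
        }
        int fwd = t.indexOf("->");
        int bwd = t.indexOf("<-");
        // The outermost arrow is taken to be the one that appears first.
        boolean forward = fwd >= 0 && (bwd < 0 || fwd < bwd);
        if (!forward && bwd < 0) {
            throw new IllegalArgumentException("no arrow in rule: " + text);
        }
        int arrow = forward ? fwd : bwd;
        String left = t.substring(0, arrow).trim();
        String right = t.substring(arrow + 2).trim();
        // The consequence is whichever side the arrow points at.
        String head = forward ? right : left;
        String bodyText = forward ? left : right;
        List<String> body = new ArrayList<>();
        for (String pattern : bodyText.split(",")) {
            body.add(pattern.trim());
        }
        return new Rule(body, head);
    }

    public static void main(String[] args) {
        String[] rules = {
            "T1, T2 -> T0 .",
            "T0 <- T1, T2 .",
            "T1, T2 -> [T0 <- T3, T4] ."
        };
        for (String text : rules) {
            Rule r = parse(text);
            System.out.println(r.body + " => " + r.head);
        }
    }
}
```

The first two rules yield the same premises and the same consequence. In the third, the consequence of the forward rule is itself the backward rule in brackets.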
Q. The domain and range inferences look wrong, is that a bug?
A: The way rdfs range and domain declarations work is completely alien to
anyone who thinks of RDFS and OWL as being a bit like a type system for
a programming language, especially an object oriented language. Whilst there
may be bugs in the inference rule sets, the most common explanation for surprising
results when listing inferred domains and ranges is this mismatch in expectations.
Suppose we have three classes: eg:Man is an rdfs:subClassOf eg:Person, which
is an rdfs:subClassOf eg:Animal. Suppose we have a property eg:personalName
which is declared to have rdfs:domain eg:Person. Now the question is:
what other values can be inferred for the rdfs:domain of eg:personalName?
In pure RDFS no additional conclusions can be made. The definition of
domain and range is intensional not extensional. It only works
forward. Declaring <eg:personalName rdfs:domain eg:Person> means that
anything to which eg:personalName is applied can be concluded to be of
type eg:Person. It does not work backward: if you somehow knew that all
things to which eg:personalName applied were also Foos, you cannot
conclude that <eg:personalName rdfs:domain Foo>.
However, RDFS permits systems to strengthen the meaning of domain and range
to be extensional, so that valid domain and range deductions can be made.
OWL makes use of this option. So in OWL, in our example, we can also
deduce that <eg:personalName rdfs:domain eg:Animal>.
If you are used to object oriented programming this may look wrong. It is
tempting, but incorrect, to think of rdfs:domain as meaning "this is the
class of objects to which this property can be applied". With that mind-set
you might expect to find that <eg:personalName rdfs:domain eg:Man>;
after all, every eg:Man is an eg:Person so it is always "legal" to apply
eg:personalName to an eg:Man. That is true, it is legal: any eg:Man is
allowed to have an eg:personalName, but rdfs:domain does not describe
what is legal. The statement <P rdfs:domain C> just means that
all things to which P is applied can be inferred to have class C.
You can see that if we tried to infer <eg:personalName rdfs:domain eg:Man>
then we would start concluding that anything with a name
was a man, which is not right: every Man can have a name but non-Man Persons
are also allowed to have names in this example.
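The same reasoning can be followed in a few lines of code. This is a hand-written sketch of the example above, not Jena's rule set: the two maps hold the rdfs:subClassOf and rdfs:domain declarations, and the hypothetical method typesOfSubject lists every class that can be inferred for anything that has the property.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class DomainSketch {
    // rdfs:subClassOf edges from the example: Man -> Person -> Animal.
    static final Map<String, String> SUPER_CLASS = new HashMap<>();
    // Declared rdfs:domain of each property.
    static final Map<String, String> DOMAIN = new HashMap<>();

    static {
        SUPER_CLASS.put("eg:Man", "eg:Person");
        SUPER_CLASS.put("eg:Person", "eg:Animal");
        DOMAIN.put("eg:personalName", "eg:Person");
    }

    // Classes that can be inferred for any subject of the given property:
    // the declared domain and every superclass of it.
    static Set<String> typesOfSubject(String property) {
        Set<String> types = new LinkedHashSet<>();
        String c = DOMAIN.get(property);
        while (c != null) {
            types.add(c);
            c = SUPER_CLASS.get(c);
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(typesOfSubject("eg:personalName"));
    }
}
```

The result contains eg:Person and eg:Animal but not eg:Man. That is why eg:Animal is a valid additional domain for eg:personalName while eg:Man is not: the inference only ever moves up the class hierarchy from the declared domain.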
Q: Why do I get a warning: Creating OWL rule reasoner working over another OWL rule reasoner
A: If you create an inference graph explicitly from an OWL reasoner or implicitly
(by using OntModelSpec.OWL_*_RULE) then it is best if the argument models
(data and schema) are plain models. It is easy to accidentally misuse the API and
create an inference model working over the results of another inference model.
This is a redundancy which significantly affects performance to no useful effect.
To help detect this situation we have added a warning message. The best way
to stop the message is to change your model construction code so that only the
final InfModel/OntModel is specified to use OWL inference. If this is not
appropriate for some reason you can disable the check and warning messages
using the global flag com.hp.hpl.jena.shared.impl.JenaParameters.enableOWLRuleOverOWLRuleWarnings.
Ontology API
Q: Why doesn't listClasses() (or listProperties()/listIndividuals(), etc) work?
A: It does work. Extensive unit tests are used to check the correctness of Jena,
and are included in the downloaded source code for your reference. If listClasses(),
or a similar method, is not producing the answers you expect, or no answers
at all, you should first check that your model is correctly defined. Print a
copy of your model as a debug step, to see if the URIs match up (e.g., if you
are expecting resource x to be an individual of class Y, check that the rdf:type
of x is the same as the URI of the class declaration for Y). A common problem
is that relative URIs change depending on where you read the model from. Try adding
an xml:base declaration to the document to ensure that URIs are
correctly specified.
Q: Why doesn't the ontology API handle sub-class (or sub-property, domain, range, etc)
relationships in a DAML model?
A: These relationships are handled correctly, but the results you see are dependent on the
model configuration. The DAML specification includes a number of aliases for RDFS constructs to
copy them into the DAML+OIL namespace. This means that, for a DAML processor, daml:subClassOf
and rdfs:subClassOf are equivalent. This is declared by means of a
daml:samePropertyAs in the daml+oil.daml specification document. Without a reasoner
attached to the model, the ontology API will not recognise the equivalence with rdfs:
properties.
Thus, if you are not seeing the expected results when processing a DAML ontology,
it is likely that your ontology file contains, for example,
<daml:Class rdf:ID="A"> <rdfs:subClassOf rdf:resource="B" /> ...
To fix this, either ensure that the ontology consistently uses daml: relationships,
or declare the ontology model with the DAML micro rule-reasoner:
OntModel m = ModelFactory.createOntologyModel( OntModelSpec.DAML_MEM_RULE_INF, null );
Q: Why does .as( OntProperty.class ) fail with ConversionException
on SymmetricProperty (or other property types)?
A: This is a slightly tricky issue. Internally, .as() calls the supports check,
which tests whether the node that is being converted is a common flavour of property.
Strictly, the only necessary test should be 'has rdf:type rdf:Property',
because that is entailed by all of the other property types. However, that requires
the user to use a model with a reasoner, and some don't want to (for good reasons, e.g. building an editor).
The other position is to test for all the possible variants of property: object property,
datatype property, annotation, ontology, transitive, functional, inverse functional, etc.
The problem with this is that it duplicates the work of the reasoner, and my expectation was that
most people would be running with a reasoner. Thus my code would be duplicating the functionality
of the reasoner, which is bad design. The compromise solution was to make the supports check test
for the common (top level) property types. Users who aren't using the reasoner
can either test explicitly for the other property types they expect to encounter (e.g. SymmetricProperty),
or can turn off the supports check by setting
strict mode to false on the model.
Database and persistence
Q: Why do I get an exception when trying to create a new persistent model?
A: Assuming that your program uses correct methods to create the model (see
examples in the database howto), it may be that
your database files are corrupted. Jena2 does not do a good job in
checking the validity of the database. It makes a cursory check that some
required tables exist but does not check that the tables contain valid data. If
you suspect your database has been corrupted, you may invoke cleanDB()
on a DBConnection object prior to creating your model. This removes all
Jena2 tables from the database.
Warning: this removes any other existing Jena2 models from the database so make sure
that this is what you want to do.
Q: Why do I run out of memory when trying to list statements in a persistent
model?
A: Jena2 uses the JDBC interface for accessing databases. The JDBC
specification has no cursors. Consequently, when a query is processed by JDBC,
the entire result set is returned from the database at once and the application program then iterates
over the in-memory result set. If the result set is large, as is often the case
when listing all statements of a large model, it may exceed the heap size of
the Java virtual machine. If you suspect this is happening, you might try to
increase the heap size of the Java virtual machine (-vmargs -Xmx500M
for a 500 MB heap size). If this does not help, there is no
other work-around and the program should be recoded.
Q: Has Jena2 persistence been ported to other database engines and platforms
besides those officially supported?
A: The Jena team supports Jena2 persistence on the databases and operating
systems listed in the Database Interface Release
Notes.
Other users have had success porting Jena2 to other databases and platforms.
Jena2 has been ported to IBM's DB2 database. Contact
Liang-Jie Zhang for details. A port to
HSQLDB, a native-Java SQL database, has been done. Contact
Jan Danils for details. On the Mac OS X
platform, Jena2 has been successfully run using MySQL. Contact
Jim Nachlin for details. A Jena2
driver for Microsoft SQL Server has been written and tested with MsSQL 2000. The
source is available at
http://www.ur.se/jena/Jena2MsSql.zip. Contact
Erik Barke for more information.
Q: Is there a limit on
the number of models in a database?
A: The limit depends on the Jena database (schema) configuration and the
database engine (MySQL, PostgreSQL, Oracle, etc). Recall that a Jena model may
either be stored separately in its own database tables (the default) or,
alternatively, in tables that are shared with other models (see
StoreWithModel in the options for
persistent models). Also, a Jena model is identified internally by a 32 bit
integer. Consequently the maximum number of models is limited either by the
maximum number of tables allowed in a database (which depends on the database
engine) or by the maximum value of a 32 bit integer, i.e., 2G.
XML serialisation (reading and writing)
Q: Why are some of my statement objects serialised as XML elements, and some as XML attributes?
A: This is a feature of the default RDF/XML writer settings in Jena 2.0.
Semantically, writing the object of a statement as an XML element or XML attribute makes
no difference to the RDF meaning. However, we have recognised that this is confusing, so
from Jena-2.1-dev3 onwards, the default is to write out all statement objects as XML elements.
The output is produced by a rule in the RDF/XML writer that
writes values that contain no spaces as attributes. This rule can be turned on or off
selectively by using an explicit RDFWriter object, rather than relying on the
defaults that apply when calling Model.write(). The means of doing this are described
in the I/O howto. In particular, the rule propertyAttr must be enabled or disabled.
RDQL and query processing
no entries yet
Miscellaneous
Q: What versions of library jars does Jena require?
A: Jena makes use of several third party Java libraries. A copy of
each of these is included in the $JENA/lib area of the distribution and we recommend
including all of these jars in your classpath. In some circumstances applications
already make use of specific versions of these libraries (e.g. Xerces) and need to
check whether the version they are using is compatible with the one shipped by Jena. The current
library versions used by Jena are:
- Xerces - 2.6.1 (not compatible with Xerces prior to 2.6.0)
- log4j - 1.2.7
- junit - ?
- icu4j - ?
- jakarta-oro - 2.0.5
- antlr - 2.7.2
- jakarta-commons-logging - 1.0.3
- concurrent - 1.3.2