Exporting and importing data in RIMMF6
This page is still in progress
This page describes
- how the program stores your data
- the data options available for an export
- the options available for an import
Your RIMMF data
Data format
When an entity record is created and saved in RIMMF, the data is stored in a disk file using the N-Triples format: 'a line-based, plain-text serialisation format for RDF graphs' (Wikipedia). These N-Triples files use the Windows file extension '.nt'. Note: although the official internet media type for an .nt file is “application/n-triples”, the files are in essence “text/plain”. The .nt files produced by RIMMF can be opened, viewed, and edited in any text editor.
As to the actual data within these files, we apply the following conventions.
Unicode characters
Unicode characters are stored in escaped format. For example, the characters which comprise the displayed string
John Le Carré
will be stored as
John Le Carr\u00E9
Here “\u00E9” is the escape for “é” (e acute), whose Unicode code point is U+00E9 in hexadecimal. The hexadecimal number must be exactly four digits long 1). When converting between Unicode characters and their escaped representations, RIMMF uses Normalization Form C.
Data exported in RIMMF6 is always Unicode-escaped.
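The convention above can be sketched in Python. This is only an illustration, not RIMMF's own code: the function names are hypothetical, and the eight-digit \U form used for characters beyond U+FFFF comes from the N-Triples grammar rather than from anything this page says RIMMF emits. Backslashes and quotation marks are deliberately left alone; the sketch covers non-ASCII characters only.

```python
import re
import unicodedata

# Matches the two N-Triples numeric escapes: \uXXXX and \UXXXXXXXX.
_UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def escape_ntriples(text: str) -> str:
    """Normalise to NFC, then replace every non-ASCII character with an escape."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain ASCII is kept as-is
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")     # exactly four hex digits
        else:
            out.append(f"\\U{cp:08X}")     # assumption: long form for non-BMP characters
    return "".join(out)


def unescape_ntriples(text: str) -> str:
    """Turn numeric escapes back into characters and normalise to NFC."""
    decoded = _UCHAR.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)
    return unicodedata.normalize("NFC", decoded)


print(escape_ntriples("John Le Carré"))
```

Because the input is normalised first, a decomposed “e” followed by a combining acute accent produces the same escaped output as the precomposed “é”.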
RDA Elements
RIMMF stores RDA Elements, or properties, using opaque identifiers. For example, when storing a triple for the RDA element 'Title of work', the URI
http://rdaregistry.info/Elements/w/P10088
will be used.
The creators of the RDA Registry developed an alternate way of identifying elements called a lexical alias. Using this property, a triple for the RDA element 'Title of work' would be represented as
http://rdaregistry.info/Elements/w/titleOfWork.en
This is a convenient naming convention to use during debugging, as opaque Ids do not easily support human comprehension. Unfortunately, despite the implied support for translation2), a lexical alias value is not included in any of the RDA translations3).
In addition, RIMMF stores only canonical RDA elements. We do not store triples using the object or datatype subclasses typically defined for each RDA entity. (Data that is imported to RIMMF using these subclasses will be mapped to the corresponding canonical class).
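As a rough sketch of the mapping described above, the following Python reduces a property URI to its canonical, opaque form. It is not RIMMF's implementation. The alias table holds only the three pairs that appear on this page, the function name is hypothetical, and the "object/" path segment is an assumption made by analogy with the "datatype/" segment shown in the export examples.

```python
BASE = "http://rdaregistry.info/Elements/"

# Only the alias/opaque pairs that appear on this page; a real table
# would have to be built from the RDA Registry itself.
ALIAS_TO_OPAQUE = {
    "w/titleOfWork.en": "w/P10088",
    "m/titleProper.en": "m/P30156",
    "w/dateOfWork.en": "w/P10219",
}

# Assumption: a subclass property differs from the canonical one only
# by an extra path segment after the entity letter.
SUBCLASS_SEGMENTS = ("datatype", "object")


def normalise_property(uri: str) -> str:
    """Map a subclass and/or lexical-alias URI to the canonical opaque URI."""
    if not uri.startswith(BASE):
        return uri                          # non-RDA properties pass through
    parts = uri[len(BASE):].split("/")
    if len(parts) == 3 and parts[1] in SUBCLASS_SEGMENTS:
        parts = [parts[0], parts[2]]        # drop the subclass segment
    key = "/".join(parts)
    return BASE + ALIAS_TO_OPAQUE.get(key, key)


print(normalise_property(BASE + "m/datatype/titleProper.en"))
```

URIs that are already canonical and opaque are returned unchanged, so the function can be applied to every predicate without checking first.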
Statements about statements
As you know, a triple is a simple statement composed of three terms:
- subject,
- predicate (or property), and
- object (value).
In order to support provenance, applications need to uniquely identify each statement. Given a unique identifier, a statement can be treated as a resource about which additional statements can be made. The term used in RDF for saying 'something about' a statement, or statements, is reification.
There are several ways to reify a statement in RDF. In RIMMF, the N-Quads format is used to assign unique identifiers to statements. N-Quads are an extension of N-Triples in which an optional fourth part, called a graph label, is appended after the object. The graphLabel assigned by RIMMF is always a unique IRI.
Another way to reify a statement is to use the built-in RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate and rdf:object). Note that at present there is no standard, universally accepted means of reification.
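To make the role of the graph label concrete, here is a minimal Python sketch that splits one statement line into its terms. The regular expression is a simplification written for this page, not a complete N-Quads parser, and the names Statement and parse_statement are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern: IRIs in angle brackets, blank nodes, and quoted
# literals with an optional datatype or language tag.
_STATEMENT = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+'                                     # subject
    r'(<[^>]*>)\s+'                                               # predicate
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)\s*'  # object
    r'(<[^>]*>)?\s*\.\s*$'                                        # optional graph label
)


class Statement(NamedTuple):
    subject: str
    predicate: str
    object: str
    graph: Optional[str]   # None for a plain triple


def parse_statement(line: str) -> Statement:
    """Split one N-Triples or N-Quads line into its three or four terms."""
    m = _STATEMENT.match(line)
    if m is None:
        raise ValueError(f"not a statement: {line!r}")
    return Statement(*m.groups())


st = parse_statement(
    '<http://rimmfdata.com/r/rks425> '
    '<http://rdaregistry.info/Elements/m/P30156> '
    '"Love Me Do" <http://rimmfdata.com/r/rks425/15> .'
)
print(st.graph)
```

When the fourth term is present it serves as the statement's identifier, so it can appear as the subject of further statements such as a timestamp.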
RIMMF-specific data and metadata
The subject of every RIMMF statement is assigned the namespace:
http://rimmfdata.com/
Sub-namespaces (path segments under this namespace) may be used to categorise statements, for example:
- http://rimmfdata.com/r – statements in the record
- http://rimmfdata.com/m – application metadata
Note that the application metadata assigned to the '/m' namespace is data that RIMMF uses internally: the version of RIMMF used to create an entity record, the Windows filename in the local storage, the entity template used to create it, timestamps, and so on. An application on the receiving end of this data can safely ignore triples in the '/m' namespace.
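A receiving application that wants to skip the application metadata could filter on the subject namespace. The following Python is a minimal sketch under that assumption; the function name is hypothetical, and it only looks at the subject, so statements in the '/r' namespace that point at a '/m' resource are kept.

```python
METADATA_PREFIX = "<http://rimmfdata.com/m/"


def drop_application_metadata(lines):
    """Keep only the lines whose subject is outside the '/m' namespace."""
    return [ln for ln in lines if not ln.lstrip().startswith(METADATA_PREFIX)]


sample = [
    '<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" .',
    '<http://rimmfdata.com/m/rks425> <http://purl.org/dc/terms/modified> "2025-03-10T12:01:46" .',
]
for line in drop_application_metadata(sample):
    print(line)
```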
Export options
In RIMMF6, an option to export records is available:
- When viewing the EI
- When viewing an R-Tree
- When viewing a Manifestation 4)
When exporting data in RIMMF6, the following options are available:
The defaults are:
- Include both metadata options
- Use N-Quad reification
- Use RDA Opaque Ids
- Use Canonical Ids
In RIMMF6, data is always exported as N-Triples and output files are assigned a '.nt' file extension. 5)
Brief example
An example based on the Title proper of a Manifestation follows, for each of the four export options. A single provenance statement is included (Note: in RDA, every statement, taken on its own, is considered an RDA Work).
Default export
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/P10219> "20220511T160408" .
Export without statement metadata
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" .
Export with Lexical alias Ids (instead of Opaque Ids)
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/titleProper.en> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/dateOfWork.en> "20220511T160408" .
Export with subclass Ids (instead of Canonical Ids)
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/datatype/P30156> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/datatype/P10219> "20220511T160408" .
Export with RDF reification vocabulary (instead of N-Quads)
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://rimmfdata.com/r/rks425> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <http://rdaregistry.info/Elements/m/P30156> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "Love Me Do" .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/P10219> "20220511T160408" .
The options above can appear in various combinations; for instance, if Lexical Ids and Subclassed properties are selected, the result would be:
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/datatype/titleProper.en> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/datatype/dateOfWork.en> "20220511T160408" .
and so on.
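The relationship between the default (N-Quads) export and the RDF reification vocabulary export shown in the examples can be sketched in Python. This is an illustration of the pattern only, not RIMMF's code; the function name reify is hypothetical, and the terms are passed in already serialised (angle brackets and quotes included).

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(subject: str, predicate: str, obj: str, statement_id: str) -> list:
    """Rewrite one quad as a plain triple plus four reification triples."""
    sid = statement_id
    return [
        f"{subject} {predicate} {obj} .",
        f"{sid} <{RDF}type> <{RDF}Statement> .",
        f"{sid} <{RDF}subject> {subject} .",
        f"{sid} <{RDF}predicate> {predicate} .",
        f"{sid} <{RDF}object> {obj} .",
    ]


for line in reify(
    "<http://rimmfdata.com/r/rks425>",
    "<http://rdaregistry.info/Elements/m/P30156>",
    '"Love Me Do"',
    "<http://rimmfdata.com/r/rks425/15>",
):
    print(line)
```

Statements whose subject is the statement identifier, such as the timestamp in the examples, are the same in both forms and need no conversion.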
Use original URIs
This option is enabled when the program is exporting data previously imported using the “External data” switch. In this case, RIMMF will have created a map of external URIs to local URIs, named #uri-to-rimmf.map, and saved it to the __history directory of the data folder.
If applicable, this option may support round-tripping data into RIMMF and out again. The recommended settings to roundtrip are:
- Include metadata–both options off
- Include RDFS labels–user choice
- Use opaque URIs–select whichever applied on import
- Use canonical URIs–select whichever applied on import
Metadata for the entity itself may be a user choice, as it is easy to filter this particular metadata, and it does not require reification–so set the reification option to “N/A”. In this case the 'about-ness' is handled by RDA itself. See the “Metadata note” that follows for details.
Metadata
In RIMMF, metadata about the description set itself is exported in a separate namespace:
<http://rimmfdata.com/m/>
and typically includes statements such as:
<http://rimmfdata.com/m/rks425> <http://rdaregistry.info/Elements/w/categoryOfWork.en> "metadata work" .
<http://rimmfdata.com/m/rks425> <http://purl.org/dc/terms/modified> "2025-03-10T12:01:46"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
and so on.
The description set and the metadata about it are linked by RDA inverse elements (continuing with the example above)
<http://rimmfdata.com/m/rks425> <http://rdaregistry.info/Elements/w/metadataDescriptionOfManifestation.en> <http://rimmfdata.com/r/rks425> .
<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/manifestationDescribedWithMetadataBy.en> <http://rimmfdata.com/m/rks425> .
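So, take care: <http://rimmfdata.com/m/rks425> and <http://rimmfdata.com/r/rks425> are not the same resource. The first identifies the metadata; the second identifies the description set it describes.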
Notes
In RDA, every statement, taken on its own, may be considered a metadata work; thus one might add the following triple to each example:
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdaregistry.info/Elements/c/C10001> .
Importing data
Importing refers to adding RDA entity records to your RIMMF6 environment.
The data being imported must use the RDA vocabularies and the N-Triples format. If working from a different serialization, such as RDF/XML, convert it to N-Triples first. The character encoding must be escaped Unicode. (Some utilities that convert RDF/XML to N-Triples, such as raptor, also convert UTF-8 to escaped Unicode.)
The interface to the “Import records” utility is located on the main menu under the “Tools” option; when selected, the following form is displayed:
To import a file of RDA entity records, drag and drop the file onto this form.
All of the options supported during an export are automatically supported during an import. When the file is first dropped onto the import form, the program parses the file with the goal of determining whether:
- the file was produced by a supported version of RIMMF (RIMMF4 or later)
- the RDA elements use LexicalAlias or Opaque Ids
- the reification method is N-Quads or RDF vocabulary
In the case of the latter two items, any needed conversions–from LexicalAlias to Opaque, from RDF vocabulary to N-Quads–will be performed during the initial parse to render the triples into the program's Default format.
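The kind of check described above might look like the following Python sketch. It is a guess at the general approach, not RIMMF's parser: the function name sniff is hypothetical, the term pattern is simplified, and the "object/" subclass segment is an assumption made by analogy with "datatype/".

```python
import re

RDF_STATEMENT = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement>"

# Simplified term pattern: IRI, literal (with optional datatype or
# language tag), or blank node.
_TERM = re.compile(
    r'<[^>]*>|"(?:[^"\\]|\\.)*"(?:\s*\^\^<[^>]*>|@[A-Za-z0-9-]+)?|_:[^\s]+'
)
_RDA_PROPERTY = re.compile(
    r"^<http://rdaregistry\.info/Elements/[a-z]+/(?:(?:datatype|object)/)?([^/>]+)>$"
)
_OPAQUE = re.compile(r"^P\d+$")


def sniff(lines):
    """Report which identifier style and reification method a file uses."""
    ids, reification = set(), set()
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue                        # skip blank lines and comments
        terms = _TERM.findall(line)
        if len(terms) < 3:
            continue
        if len(terms) == 4:
            reification.add("n-quads")      # a fourth term is a graph label
        if terms[2] == RDF_STATEMENT:
            reification.add("rdf-vocabulary")
        m = _RDA_PROPERTY.match(terms[1])
        if m:
            ids.add("opaque" if _OPAQUE.match(m.group(1)) else "lexical-alias")
    return {"ids": sorted(ids), "reification": sorted(reification)}


default_export = [
    '<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .',
    '<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/P10219> "20220511T160408" .',
]
print(sniff(default_export))
```

An empty "reification" list means the file carries no statement identifiers at all, as in an export without statement metadata.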
If the import process is successful, the user is prompted to enter a folder name; the imported records will be added to the new folder; a subsequent option automates the “Change data folder” action and opens an Entity Index on the new folder.
Notes and Exceptions
For the most part, the import tool expects the incoming data to have been generated by RIMMF6.
RIMMF4 data folders can be dropped onto the RIMMF6 import form with good results, but data from any earlier RIMMF will fail.
There is an attempt to support non-RIMMF data provided it uses N-Triples and RDA Elements. This support is activated by selecting the External data box before dropping the file onto the form. If successful, and if there are non-RDA properties in the file, they will be added to the respective entity record in a raw format–i.e. instead of displaying a human-readable label for a statement, the “Element Label” column in the RIMMF display will contain the property URI used in the imported triple.
Note that, at least in the current release, the External data box must be unchecked when importing RIMMF6 data 6).
When the import tool prompts for a folder name, take care that the new folder name does not already exist. If it does, the import process will fail and need to be restarted.
When RIMMF6 (or in some cases, RIMMF4) produces an export file (using the Export option in the Entity Index menu), a comment line will appear at the beginning of each entity record.
Something like this:
# BEGIN http://rimmfdata.com/r/rks13637
<http://rimmfdata.com/r/rks13637> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdaregistry.info/Elements/c/C10004> .
This comment may be useful when parsing the file but it has no semantic purpose.
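One way a parser might use the comment is to group an export file's statements by entity record. The Python below is a minimal sketch that assumes every record starts with a "# BEGIN" line followed by the entity URI, as in the sample; the function name split_records is hypothetical.

```python
def split_records(lines):
    """Group statement lines under the entity URI of the preceding # BEGIN."""
    records = {}
    current = None
    for line in lines:
        text = line.strip()
        if text.startswith("# BEGIN"):
            current = text[len("# BEGIN"):].strip()
            records[current] = []
        elif text and not text.startswith("#") and current is not None:
            records[current].append(text)   # statements before any BEGIN are skipped
    return records


export = [
    "# BEGIN http://rimmfdata.com/r/rks13637",
    "<http://rimmfdata.com/r/rks13637> "
    "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
    "<http://rdaregistry.info/Elements/c/C10004> .",
]
for uri, statements in split_records(export).items():
    print(uri, len(statements))
```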