

RIMMF6

RIMMF6 is a continuing development of what began as RIMMF4.

(RIMMF5 was an experiment that was never released.)

Current release: 20230615

List of recent changes: rimmf4:changes6

RIMMF6 can be downloaded here:
https://www.marcofquality.com/sft/setupRimmf6.exe

A portable installation of RIMMF6 is available from this link:
https://www.marcofquality.com/sft/setupRimmf6.zip

Exporting data

This page is still in progress

This page describes

  1. how the program stores your data
  2. the data options available for an export
  3. the options available for an import

Your RIMMF data

Data format

When an entity record is created and saved in RIMMF, the data is stored in a disk file using the N-Triples format: 'a line-based, plain-text serialisation format for RDF graphs' (Wikipedia). These N-Triples files use the Windows file extension '.nt'. Note: although the 'official' internet media type for an .nt file is “application/n-triples”, they are in essence “text/plain” files. The .nt files produced by RIMMF can be opened, viewed, and edited in any text editor.

As to the actual data within these files, we apply the following conventions.

Unicode characters

Unicode characters are stored in escaped format. For example, the characters which comprise the displayed string

John Le Carré

will be stored as

John Le Carr\u00E9

Here “\u00E9” represents the hexadecimal Unicode code point for “é” (e acute). The hexadecimal number must be exactly four digits long 1). When converting between Unicode characters and their escaped representations, RIMMF uses Normalization Form C.

Data exported in RIMMF6 is always Unicode-escaped.
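The escaping convention above can be sketched in a few lines of Python. This is an illustration only, not RIMMF's own code: the function names are hypothetical, the eight-digit \U form for code points above U+FFFF comes from the N-Triples specification rather than from this page, and the other N-Triples escapes (quotes, backslashes, newlines) are deliberately left out.

```python
import re
import unicodedata


def escape_unicode(text: str) -> str:
    """Normalize to NFC, then write every non-ASCII character as an escape."""
    parts = []
    for ch in unicodedata.normalize("NFC", text):
        code = ord(ch)
        if code < 0x80:
            parts.append(ch)                  # plain ASCII is kept as-is
        elif code <= 0xFFFF:
            parts.append("\\u%04X" % code)    # exactly four hex digits
        else:
            parts.append("\\U%08X" % code)    # eight digits (N-Triples spec)
    return "".join(parts)


# Matches \uXXXX or \UXXXXXXXX. An escaped backslash in front of "u" is not
# recognised, which is one of the simplifications of this sketch.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_unicode(text: str) -> str:
    """Turn escapes back into characters and return the result in NFC."""
    decoded = _ESCAPE.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)
    return unicodedata.normalize("NFC", decoded)


print(escape_unicode("John Le Carré"))  # John Le Carr\u00E9
```

Because the text is normalized before escaping, a decomposed “e” followed by a combining acute accent produces the same \u00E9 as the precomposed character.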

RDA Elements

RIMMF stores RDA Elements, or properties, using opaque identifiers. For example, when storing a triple for the RDA element 'Title of work', the URI

http://rdaregistry.info/Elements/w/P10088

will be used.

The creators of the RDA Registry developed an alternate way of identifying elements called a lexical alias. Using this property, a triple for the RDA element 'Title of work' would be represented as

http://rdaregistry.info/Elements/w/titleOfWork.en

This is a convenient naming convention to use during debugging, as opaque Ids are not easy for humans to read. Unfortunately, despite the implied support for translation 2), a lexical alias value is not included in any of the RDA translations 3).

In addition, RIMMF stores only canonical RDA elements. We do not store triples using the object or datatype subclasses typically defined for each RDA entity. (Data that is imported to RIMMF using these subclasses will be mapped to the corresponding canonical class.)
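Switching between the two identifier styles described above amounts to a table lookup. The sketch below uses only the three opaque/alias pairs that appear on this page; a real table would have to be built from an RDA Registry release, and the function names are hypothetical.

```python
# Opaque Id -> lexical alias, limited to the pairs shown on this page.
OPAQUE_TO_ALIAS = {
    "http://rdaregistry.info/Elements/w/P10088":
        "http://rdaregistry.info/Elements/w/titleOfWork.en",
    "http://rdaregistry.info/Elements/m/P30156":
        "http://rdaregistry.info/Elements/m/titleProper.en",
    "http://rdaregistry.info/Elements/w/P10219":
        "http://rdaregistry.info/Elements/w/dateOfWork.en",
}
ALIAS_TO_OPAQUE = {alias: opaque for opaque, alias in OPAQUE_TO_ALIAS.items()}


def to_lexical_alias(iri: str) -> str:
    """Return the lexical alias for an opaque Id, or the IRI unchanged."""
    return OPAQUE_TO_ALIAS.get(iri, iri)


def to_opaque(iri: str) -> str:
    """Return the opaque Id for a lexical alias, or the IRI unchanged."""
    return ALIAS_TO_OPAQUE.get(iri, iri)
```

Unknown IRIs pass through untouched, so non-RDA properties in the same file are not disturbed.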

Statements about statements

As you know, a triple is a simple statement composed of three terms:

  1. subject,
  2. predicate (or property), and
  3. object (value).

In order to support provenance, applications need to uniquely identify each statement. Given a unique identifier, a statement can be treated as a resource about which additional statements can be made. The term used in RDF for saying 'something about' a statement, or statements, is reification.

There are several ways to reify a statement in RDF. In RIMMF, the N-Quads format is used to assign unique identifiers to statements. N-Quads are an extension of N-Triples in which an optional fourth part, called a graph label, is appended after the object. The graph label assigned by RIMMF is always a unique IRI.

Another way to reify a statement is to use the built-in RDF reification vocabulary; find out more about that here. Note that at present there is no standard, universally accepted, means of reification.
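The relationship between the two reification styles can be sketched as follows: a statement that carries a graph label is rewritten as a plain triple plus four triples from the RDF reification vocabulary, with the graph label reused as the statement identifier. This is a minimal illustration, not RIMMF's implementation; the function name and the regular expression are assumptions, and blank nodes are not handled.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Subject IRI, predicate IRI, object (IRI or quoted literal with optional
# language tag or datatype), optional graph label, closing dot.
STATEMENT = re.compile(
    r'^\s*(<[^>]*>)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'
    r'(?:\s+(<[^>]*>))?\s*\.\s*$'
)


def reify(line: str) -> list[str]:
    """Rewrite one N-Quads line using the RDF reification vocabulary."""
    match = STATEMENT.match(line)
    if match is None:
        raise ValueError("not a statement this sketch can parse: %r" % line)
    subject, predicate, obj, graph = match.groups()
    plain = f"{subject} {predicate} {obj} ."
    if graph is None:
        return [plain]  # no graph label, so nothing to reify
    return [
        plain,
        f"{graph} <{RDF}type> <{RDF}Statement> .",
        f"{graph} <{RDF}subject> {subject} .",
        f"{graph} <{RDF}predicate> {predicate} .",
        f"{graph} <{RDF}object> {obj} .",
    ]
```

A line without a graph label comes back as a single unchanged triple, so the function can be applied to every statement in a file.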

RIMMF-specific data and metadata

The subject of every RIMMF statement is assigned the namespace:

http://rimmfdata.com/

Subdomains may be used to categorise statements, for example:

Note that the application metadata assigned to the '/m' namespace is data that RIMMF uses internally: the version of RIMMF used to create an entity record, the Windows filename in the local storage, the entity template used to create it, timestamps, and so on. An application on the receiving end of this data can safely ignore triples in the '/m' namespace.
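A receiving application that wants to skip the application metadata could filter it out before parsing. The sketch below assumes that the '/m' namespace shows up as the subject prefix http://rimmfdata.com/m/ (by analogy with the /r/ subjects in the examples on this page); that prefix and the function name are assumptions, not something this page confirms.

```python
# ASSUMPTION: application metadata statements have a subject IRI that starts
# with this prefix. Adjust it to match the data actually received.
APP_METADATA_PREFIX = "<http://rimmfdata.com/m/"


def drop_application_metadata(lines):
    """Return the lines whose subject is not in the assumed '/m' namespace."""
    return [
        line for line in lines
        if not line.lstrip().startswith(APP_METADATA_PREFIX)
    ]
```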

Export options

In RIMMF6, an option to export records is available:

  • When viewing the EI
  • When viewing an R-Tree
  • When viewing a Manifestation 4)

When exporting data in RIMMF6, the following options are available:

By Default is meant:

  • RDA Opaque Ids
  • Graph labels for reification

The other options on the form refer to processing options described above:

  • LexicalAlias identifiers
  • RDF reification vocabulary.

In RIMMF6, data is always exported as N-Triples and the output file is assigned a '.nt' file extension. In the past, RIMMF supported '.zip' versions of an export. Given the relatively small file sizes involved using RIMMF, and the blocking of '.zip' email attachments by many institutions, this option has been removed. The user can easily zip an exported data file themselves if needed.

Brief example

An example based on the Title proper of a Manifestation follows, for each of the four export options. A single provenance statement is included (Note: in RDA, every statement, taken on its own, is considered an RDA Work).

Default export

<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/P10219> "20220511T160408" .

Export with Lexical alias Ids

<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/titleProper.en> "Love Me Do" <http://rimmfdata.com/r/rks425/15> .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/dateOfWork.en> "20220511T160408" .

Export with RDF reification vocabulary

<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/P30156> "Love Me Do" .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://rimmfdata.com/r/rks425> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <http://rdaregistry.info/Elements/m/P30156> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "Love Me Do" .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/P10219> "20220511T160408" .

Export with Lexical alias Ids and RDF reification vocabulary

<http://rimmfdata.com/r/rks425> <http://rdaregistry.info/Elements/m/titleProper.en> "Love Me Do" .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://rimmfdata.com/r/rks425> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <http://rdaregistry.info/Elements/m/titleProper.en> .
<http://rimmfdata.com/r/rks425/15> <http://www.w3.org/1999/02/22-rdf-syntax-ns#object> "Love Me Do" .
<http://rimmfdata.com/r/rks425/15> <http://rdaregistry.info/Elements/w/dateOfWork.en> "20220511T160408" .

Importing data

Importing refers to adding RDA entity records to your RIMMF6 environment.

The data being imported must use the RDA vocabularies and the N-Triples format. If working from a different serialization, such as RDF/XML, convert it to N-Triples first. The character encoding must be escaped Unicode. (Some utilities that convert RDF/XML to N-Triples, such as raptor, also convert UTF-8 to escaped Unicode.)

The interface to the “Import records” utility is located on the main menu under the “Tools” option; when selected, the following form is displayed:

To import a file of RDA entity records, drag and drop the file onto this form.

All of the options supported during an export are automatically supported during an import. When the file is first dropped onto the import form, the program parses the file with the goal of determining whether:

  • the file was produced by a supported version of RIMMF (RIMMF4 or later)
  • the RDA elements use LexicalAlias or Opaque Ids
  • the reification method is N-Quads or RDF vocabulary

In the case of the latter two items, any needed conversions (from LexicalAlias to Opaque, from RDF vocabulary to N-Quads) will be performed during the initial parse to render the triples into the program's Default format.
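The kind of inspection described above could look something like the following sketch, which scans the lines of a file and reports which identifier style and which reification method it appears to use. It is not RIMMF's actual parser: the function name, the regular expressions, and the returned labels are all assumptions, and the RIMMF version check is left out.

```python
import re

RDF_STATEMENT = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement>"

# Subject, predicate, object (IRI or literal), optional graph label, dot.
STATEMENT = re.compile(
    r'^\s*(<[^>]*>)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)'
    r'(?:\s+(<[^>]*>))?\s*\.\s*$'
)
RDA_ELEMENT = re.compile(r"<http://rdaregistry\.info/Elements/[a-z]+/([^>/]+)>")
OPAQUE_ID = re.compile(r"P\d+")  # for example P30156


def sniff_export_options(lines):
    """Guess (identifier style, reification method) from N-Triples/N-Quads lines."""
    identifiers = "unknown"
    reification = "none"
    for line in lines:
        match = STATEMENT.match(line)
        if match is None:
            continue  # blank lines, comments, anything this sketch cannot parse
        _subject, predicate, obj, graph = match.groups()
        element = RDA_ELEMENT.fullmatch(predicate)
        if element is not None and identifiers == "unknown":
            # The first RDA element found decides the identifier style.
            local_name = element.group(1)
            identifiers = "opaque" if OPAQUE_ID.fullmatch(local_name) else "lexical alias"
        if graph is not None:
            reification = "n-quads"
        elif obj == RDF_STATEMENT and reification == "none":
            reification = "rdf vocabulary"
    return identifiers, reification
```

For the Default export example on this page the function returns ("opaque", "n-quads"). A file with no graph labels and no rdf:Statement resources is reported as having no reification at all.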

If the import process is successful, the user is prompted to enter a folder name, and the imported records are added to the new folder. A subsequent option automates the “Change data folder” action and opens an Entity Index on the new folder.

Notes and Exceptions

For the most part, the import tool expects the incoming data to have been generated by RIMMF6.

RIMMF4 data folders can be dropped onto the RIMMF6 import form with good results, but data from any earlier RIMMF will fail.

There is an attempt to support non-RIMMF data, provided it uses N-Triples and RDA Elements. This support is activated by selecting the External data box before dropping the file onto the form. If the import succeeds and the file contains non-RDA properties, they will be added to the respective entity record in a raw format: instead of displaying a human-readable label for a statement, the “Element Label” column in the RIMMF display will contain the property URI used in the imported triple.

Note that, at least in the current release, the External data box must be unchecked when importing RIMMF6 data 5).

When the import tool prompts for a folder name, take care that the new folder name does not already exist. If it does, the import process will fail and need to be restarted.

When RIMMF6 (or in some cases, RIMMF4) produces an export file (using the Export option in the Entity Index menu), a comment line will appear at the beginning of each entity record.

Something like this:

# BEGIN http://rimmfdata.com/r/rks13637
<http://rimmfdata.com/r/rks13637> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdaregistry.info/Elements/c/C10004> .

This comment may be useful when parsing the file but it has no semantic purpose.
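As one way of putting the comment to use, the sketch below groups the statements of an export file by the IRI given in each “# BEGIN” line. The function name is hypothetical, and the sketch assumes that every entity record starts with such a comment; statements that appear before the first one are skipped.

```python
BEGIN_MARKER = "# BEGIN "


def split_records(lines):
    """Group statement lines by the entity IRI named in each '# BEGIN' comment."""
    records = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith(BEGIN_MARKER):
            current = line[len(BEGIN_MARKER):].strip()
            records.setdefault(current, [])
        elif line and not line.startswith("#") and current is not None:
            # A statement belonging to the most recently opened record.
            records[current].append(line)
    return records
```

The result is a dictionary keyed by entity IRI, which makes it easy to count the records in a file or to pull out a single one.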

2023/06/30 20:27 · Rick

RV

RV (“Registry Viewer”) is a tool that we have developed over the years to help us navigate/understand/research/troubleshoot the RDA Registry.

The input to RV is the N-Triples serialization of any registry release. The output consists of:

  1. various reports and data analyses
  2. several different views of the data
  3. customizable filters, extracts, etc.
  4. the element sets used by RIMMF4 and RIMMF6
  5. differencing tools (compare two releases of RDA Registry)

This utility is typically updated after each registry release.

Download the latest RV update (5.0.7) https://rimmf.com/data/rvinstall-507.exe

RIMMF4

RIMMF4 (R4) is our attempt to model changes to RDA introduced by LRM and the 3R project.

Work on R4 began in 2017/2018.

The objectives/status of R4 are:

  • replace the homegrown text output used by previous RIMMFs with an RDF N-Triples serialization [complete]
  • replace customized element tables with RDA Registry linked data [complete]
  • support new RDA concepts: recording methods, nomens, machine-readable SES [complete]
  • support data provenance using RDA itself [completed in R6]
  • separate the EI (entity index) from the discovery interface [to do]
  • improve URI management, which will be needed by triplestore cataloging [to do]
  • conversion of RIMMF3 data folders to RDF [in progress as a separate app]
  • develop an API for storing/sharing user-created vocabulary terms [beyond the scope of RIMMF]

R4 does not have any awareness of MARC. For MARC-to-RDA conversion RIMMF3 will be more useful 6).

RIMMF4 development has ended. It is being continued by RIMMF6 (above).

Availability

Last release: 20210421

A list of changes in the current (and recent) releases may be found HERE.

R4 can be downloaded from this link:
https://www.marcofquality.com/sft/setupRimmf4.exe

A portable installation of R4 is available from this link:
https://www.marcofquality.com/sft/setupRimmf4.zip

R4 runs on the Microsoft Windows operating system.

The installation documentation for RIMMF3 can be used for R4: substitute 'RIMMF4' for 'RIMMF3' where applicable and ignore the download links 7).

R4 can be installed without harming existing RIMMF installations and data on the same computer.

RIMMF3 to RIMMF4 conversion

A workflow for converting a RIMMF3 data folder to RIMMF4 linked data is under construction (here is a link to some notes on this topic).

If you have RIMMF3 data that you would like converted to RIMMF4, please contact us. We can use your data to test the conversion, and potentially return the data in the new format.

1) More information is available from the original N-Triples specification: https://www.w3.org/TR/rdf-testcases/#ntrip_strings
2) evidenced in the .en suffix
3) i.e. only English values are available
4) and selecting 'Export all records in set' from the 'Rda record set' submenu
5) but not RIMMF4, strangely enough; this anomaly is something to be resolved in a future update
6) Keeping in mind the element set used by RIMMF3 is based on RDA Registry version 2.7.3, published in Oct. 2017
7) unless you want to download RIMMF3
rimmf4/start.1688157062.txt.gz · Last modified: 2023/06/30 20:31 by Rick
CC Attribution-Share Alike 4.0 International