Skip to content
Logo

Generate RDF quads from a MS Access database

Friday, 10 March 2023. Door: Redmer Kronemeijer.

RDF – a.k.a. linked data, a.k.a. linked open data, a.k.a. the technology underbedding knowledge graphs, a.k.a. data meshes – is the best way to integrate data from different source. Sometimes, such a source is Microsoft Access database. Therefore I’ve made @rdmr-eu/rdfjs-source-msaccess to enable MS Access databases as an RDF source in NodeJS code, as well as generate quads from the command line. That way, Access databases can be used in the wider ecosystem of RDF-JS libraries, like Comunica, which allows us to make federative queries also over Access databases.

I’ve made this for my job where a information management tool initially was developed inside Access, but as the product evolved into a fully-fledged ontology, I needed to find a way to export the information contained therein. The package builds upon mdb-reader and provides a RDF-JS Source to be compatible .

The package can be used both as a library in NodeJS, or as a command line tool. You can try out the latter with npx @rdmr-eu/rdfjs-source-msaccess -i DATABASE.accdb. Password-protected databases, alternative base IRIs, output to stdout and a .nq-file are all supported, as well as two different output modes.

Two different modelling modes

In the transformation pipeline we used at work, we initially used an easy quad-structure: the table name was the graph name, the row number the triple subject, the column name the triple predicate and the value was its object. As an example, the following table:

Id (Long) DateAdded (Date/Time) ContentBody (Text with markup) Published (Boolean)
1 10-3-2023 This is content Ja
2 9-3-2023 Old Nee

… would generate the following RDF data:

<csv:table/ContentTable> {
  <csv:table/ContentTable/row/1> <csv:Id> "1"^^xsd:long ;
    <csv:DateAdded> "2023-03-10T00:00:00.000Z"^^xsd:dateTime ;
    <csv:ContentBody> "<div>This is <strong>content</strong></div>" ;
    <csv:Published> "true"^^xsd:boolean .

  <csv:table/ContentTable/row/2> <csv:Id> "2"^^xsd:long ;
    <csv:DateAdded> "2023-03-09T00:00:00.000Z"^^xsd:dateTime ;
    <csv:ContentBody> "<div><em>Old</em></div>" ;
    <csv:Published> "false"^^xsd:boolean .
}

Lately, we’ve been using SPARQL-Anything (if I could write Java and there’d be a good MS Access library for Java, I’d’ve written this library for SPARQL-Anything!) which is very useful to create quads from all kinds of non-RDF data sources. The modelling paradigm called Facade-X is somewhat different and based on the spreadsheet format, I’ve added another output mode for @rdmr-eu/rdfjs-source-msaccess to make it resemble Facade-X.

<https://example.org/data#ContentTable> {
  [] rdf:type fx:root ;
    rdf:_1 _:ContentTable1 ;
    rdf:_2 _:ContentTable2 .

  _:ContentTable1 xyz:Id "1"^^xsd:long ;
    xyz:DateAdded "2023-03-10T00:00:00.000Z"^^xsd:dateTime ;
    xyz:ContentBody "<div>This is <strong>content</strong></div>" ;
    xyz:Published "true"^^xsd:boolean .

  _:ContentTable2 xyz:Id "2"^^xsd:long ;
    xyz:DateAdded "2023-03-09T00:00:00.000Z"^^xsd:dateTime ;
    xyz:ContentBody "<div><em>Old</em></div>" ;
    xyz:Published "false"^^xsd:boolean .
}

I hope @rdmr-eu/rdfjs-source-msaccess will prove to be as useful for others as it has been for us. It’s open source licensed under MPL-2 and open for feedback and pull requests at GitHub.

# Permalink
Logo

Redmer Kronemeijer is data-architect bij CROW, waar hij werkt met Linked data.

  • Archief
  • Mee bezig
  • Contact
  • Twitter
  • Mastodon
  • GitHub
  • LinkedIn
  • Privacybeleid
  • Impressum
© 2016-2023. Alle rechten voorbehouden.