# Wikidata JSKOS

[![GitHub release](https://img.shields.io/github/release/gbv/wikidata-jskos.svg)](https://github.com/gbv/wikidata-jskos/releases/latest)
[![API Status](https://coli-conc-status.fly.dev/api/badge/19/status?label=API)](https://coli-conc.gbv.de/services/wikidata/)
[![License](https://img.shields.io/github/license/gbv/wikidata-jskos.svg)](./LICENSE.md)
[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Fgbv%2Fwikidata--jskos-informational)](./docker/README.md)
[![Test](https://github.com/gbv/wikidata-jskos/actions/workflows/test.yml/badge.svg)](https://github.com/gbv/wikidata-jskos/actions/workflows/test.yml)
[![npm version](http://img.shields.io/npm/v/wikidata-jskos.svg?style=flat)](https://www.npmjs.org/package/wikidata-jskos)
[![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg)](https://github.com/RichardLitt/standard-readme)

> Access [Wikidata] in [JSKOS] format

This node module provides [a web service](#web-service), a [command line client](#command-line-tool), and [a library](#api) to access Wikidata in [JSKOS] format. The data includes Wikidata items as concepts and concept schemes (read) and mappings between Wikidata and other authority files (read and write).

## Table of Contents

- [Background](#background)
- [Install](#install)
  - [Docker](#docker)
  - [Node.js](#nodejs)
- [Configuration](#configuration)
- [Usage](#usage)
- [Web Service](#web-service)
  - [Authentication](#authentication)
  - [GET /status](#get-status)
  - [GET /concepts](#get-concepts)
  - [GET /concepts/suggest](#get-conceptssuggest)
  - [GET /mappings](#get-mappings)
  - [GET /mappings/voc](#get-mappingsvoc)
  - [GET /mappings/:\_id](#get-mappings_id)
  - [POST /mappings](#post-mappings)
  - [PUT /mappings/:\_id](#put-mappings_id)
  - [DELETE /mappings/:\_id](#delete-mappings_id)
- [Command line tool](#command-line-tool)
  - [wdjskos concept](#wdjskos-concept)
  - [wdjskos mappings](#wdjskos-mappings)
  - [wdjskos schemes](#wdjskos-schemes)
  - [wdjskos update](#wdjskos-update)
  - [wdjskos find](#wdjskos-find)
  - [wdjskos mapping-item](#wdjskos-mapping-item)
- [API](#api)
  - [mapEntity](#mapentity)
    - [Map selected parts of a Wikidata entity](#map-selected-parts-of-a-wikidata-entity)
    - [Map simplified Wikidata entities](#map-simplified-wikidata-entities)
  - [mapMapping](#mapmapping)
- [Maintainers](#maintainers)
  - [Publish](#publish)
- [Contributing](#contributing)
- [License](#license)

[mapEntity]: #mapentity
[mapMapping]: #mapmapping

## Background

[Wikidata] is a large knowledge base with detailed information about all kinds of entities. Mapping its data model to the [JSKOS] data format allows simplified reuse of Wikidata as an authority file. This implementation is used in the [Cocoda web application](https://coli-conc.gbv.de/cocoda/) but it can also be used independently.

The mapping between Wikidata and JSKOS format includes:

* Wikidata items expressed as authority records ([JSKOS Concepts](https://gbv.github.io/jskos/jskos.html#concept))
* Selected Wikidata items covering information about authority files ([JSKOS Concept Schemes](https://gbv.github.io/jskos/jskos.html#concept-schemes))
* Selected Wikidata statements linking Wikidata to other authority files ([JSKOS Mappings](https://gbv.github.io/jskos/jskos.html#concept-mappings))

In addition, a search service is provided for selecting a Wikidata item with typeahead. Editing Wikidata mapping statements to other authority files requires [authentication](#authentication) via OAuth.

The following authority files have been tested successfully:

* Basisklassifikation (BK)
* Regensburg Classification (RVK)
* Integrated Authority File (GND)
* Nomisma
* Iconclass

Other systems (not including DDC) may also work but they have not been converted to JSKOS yet, so they are not provided for browsing in Cocoda.

## Install

### Docker

The easiest way to run wikidata-jskos is via Docker. Please refer to the [Docker documentation](./docker/README.md).

### Node.js

Node.js 18 is required (Node.js 20 recommended).

```sh
git clone https://github.com/gbv/wikidata-jskos.git
cd wikidata-jskos
npm ci
```

Optionally make the [command line tool](#command-line-tool) `wdjskos` available:

```sh
npm link
```

For development of the web service with hot reload and auto reconnect at <http://localhost:2013/>:

```bash
npm run start
```

For deployment of the web service (if not using Docker) there is a config file to use with [pm2](http://pm2.keymetrics.io/):

```bash
cp ecosystem.example.json ecosystem.config.json
pm2 start ecosystem.config.json
```

## Configuration

You can customize the application settings via a configuration file, e.g. by providing a generic `config.json` file and/or a more specific `config.{env}.json` file (where `{env}` is the environment like `development` or `production`). The latter takes precedence over the former, and all missing keys default to the values from `config.default.json`.

All configuration options can also be set via environment variables (`.env` when running via Node.js or using `environment` or `env_file` in Docker Compose).

Some notes:

- To use a custom Wikibase instance, you can set the subkeys of the `wikibase` property. Both `instance` and `sparqlEndpoint` are necessary. By default, Wikidata is used.
- wikidata-jskos supports saving, editing, and deleting mappings in Wikidata. To enable this, you will need to provide `auth.algorithm` and `auth.key` (algorithm and key to decode the JWT, usually coming from [login-server]), as well as `oauth.consumer_key` and `oauth.consumer_secret` (for your registered OAuth consumer).
  - `auth.key`/`AUTH_KEY` contain line breaks. In JSON, these can simply be set as `\n`. When using `.env` or `env_file`, the whole key needs to be double-quoted (`"-----BEGIN PUBLIC KEY-----\n..."`). To set `AUTH_KEY` directly in `docker-compose.yml` via `environment`, please look at the included [`docker-compose.yml`](./docker/docker-compose.yml) file or refer to [this StackOverflow answer](https://stackoverflow.com/a/53198865).
- Please provide a `baseUrl` when used in production. If no `baseUrl` is provided, `http://localhost:${port}/` will be used.
- List of all available configuration options:

  | `config.json` key       | environment variable | default value                       |
  | ----------------------- | -------------------- | ----------------------------------- |
  | title                   | TITLE                | Wikidata JSKOS Service              |
  | wikibase.instance       | WIKIBASE_INSTANCE    | `https://www.wikidata.org`          |
  | wikibase.sparqlEndpoint | WIKIBASE_SPARQL      | `https://query.wikidata.org/sparql` |
  | wikibase.api            | WIKIBASE_API         | `${wikibase.instance}/w/api.php`    |
  | verbosity               | VERBOSITY            | false                               |
  | baseUrl                 | BASE_URL             | `http://localhost:${port}/`         |
  | port                    | PORT                 | 2013                                |
  | auth.algorithm          | AUTH_ALGORITHM       | HS256                               |
  | auth.key                | AUTH_KEY             | null                                |
  | oauth.consumer_key      | OAUTH_KEY            | null                                |
  | oauth.consumer_secret   | OAUTH_SECRET         | null                                |

The list of concept schemes to read and write mappings to is hard-coded in the directory [assets](assets). To update concept schemes, regularly run:

```bash
npm run update
```

## Usage

See below for use of the [Web Service](#web-service), the [command line tool](#command-line-tool), and the JavaScript [API](#api).

## Web Service

An instance is available at <https://coli-conc.gbv.de/services/wikidata/>. The service provides selected endpoints of [JSKOS API](https://github.com/gbv/jskos-server#api).

### Authentication

The following endpoints require an authenticated user:

- [POST /mappings](#post-mappings)
- [PUT /mappings/:_id](#put-mappings_id)
- [DELETE /mappings/:_id](#delete-mappings_id)

Authentication works via a JWT (JSON Web Token). The JWT has to be provided as a Bearer token in the `Authorization` header, e.g. `Authorization: Bearer <token>`. It is integrated with [login-server] and the JWT is required to have the same format as the one login-server provides. Specifically, the OAuth token and secret for the user need to be provided as follows:

```json
{
  "user": {
    "identities": {
      "wikidata": {
        "oauth": {
          "token": "..",
          "token_secret": "..."
        }
      }
    }
  }
}
```

There are more properties in the JWT, but those are not used by wikidata-jskos.

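As a minimal sketch of the payload structure shown above, the following helper extracts the OAuth credentials from a JWT payload that has already been verified and decoded. The function name `getWikidataOAuth` is hypothetical and not part of wikidata-jskos, and signature verification is assumed to happen elsewhere.

```js
// Hypothetical helper, not part of wikidata-jskos: reads the Wikidata OAuth
// credentials from a decoded JWT payload shaped like the example above.
// Returns null if the expected fields are missing.
function getWikidataOAuth(payload) {
  const oauth = payload?.user?.identities?.wikidata?.oauth
  if (!oauth || typeof oauth.token !== "string" || typeof oauth.token_secret !== "string") {
    return null
  }
  return { token: oauth.token, token_secret: oauth.token_secret }
}

// Example payload with placeholder values
const examplePayload = {
  user: {
    identities: {
      wikidata: {
        oauth: { token: "example-token", token_secret: "example-secret" },
      },
    },
  },
}

console.log(getWikidataOAuth(examplePayload))
console.log(getWikidataOAuth({ user: { identities: {} } }))
```

A payload without a `wikidata` identity yields `null`, which a caller could treat as "user has not connected a Wikidata account".
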
Note that the JWT needs to be signed with the private key that corresponds to the public key provided in the [configuration](#configuration). Also, the OAuth user token and secret need to come from the same OAuth consumer provided in the config.

### GET /status

Returns a JSKOS API status object. See [JSKOS Server] for details.

### GET /concepts

Look up Wikidata items as [JSKOS Concepts] by their entity URI or QID.

* **URL Params**

  `uri=[uri]` URIs for concepts separated by `|`.

  `language` or `languages`: comma separated list of language codes.

* **Success Response**

  JSON array of [JSKOS Concepts]

Only some Wikidata properties are mapped to JSKOS fields. The result also contains `broader` links determined by an additional SPARQL request.

The deprecated alias `/concept` is going to be removed.

### GET /concepts/suggest

OpenSearch Suggest endpoint for typeahead search.

[JSKOS Concept Schemes]: https://gbv.github.io/jskos/jskos.html#concept-schemes
[JSKOS Server]: https://github.com/gbv/jskos-server
[JSKOS Concepts]: https://gbv.github.io/jskos/jskos.html#concept
[JSKOS Concept Mappings]: https://gbv.github.io/jskos/jskos.html#concept-mappings
[Wikidata properties for authority control]: http://www.wikidata.org/entity/Q18614948

The deprecated aliases `/concept/suggest` and `/suggest` are going to be removed.

### GET /mappings

Look up Wikidata mapping statements as [JSKOS Concept Mappings] between Wikidata items (query parameter `from`) and external identifiers (query parameter `to`). At least one of the two parameters must be given.

* **URL Params**

  `from=[uriOrNotation1|uriOrNotation2|...]` specify the source URI or notation (multiple URIs/notations separated by `|`)

  `to=[uriOrNotation1|uriOrNotation2|...]` specify the target URI or notation (multiple URIs/notations separated by `|`)

  `fromScheme=[uri|notation]` only show mappings from this concept scheme (URI or notation)

  `toScheme=[uri|notation]` only show mappings to this concept scheme (URI or notation)

  `language` or `languages` enables inclusion of entity labels. A comma separated list of language codes is used as preference list.

  `mode=[mode]` specify the mode for `from`, `to`, one of `and` (default) and `or`

  `direction=forward|backward|both` searches mappings from `from` to `to` (default), reverse, or in both directions

  `limit=[number]` maximum number of mappings to return (not fully implemented)

  `offset=[number]` start number of mappings to return (not fully implemented)

  Concept Schemes are identified by BARTOC IDs (e.g. <http://bartoc.org/en/node/430>).

* **Success Response**

  JSON array of [JSKOS Concept Mappings]

* **Examples**

  `?from=http://www.wikidata.org/entity/Q42`

  `?to=http://d-nb.info/gnd/119033364`

Mapping relation types ([P4390]) are respected, if given, see for example mapping from Wikidata to <http://d-nb.info/gnd/7527800-5>.

[P1921]: http://www.wikidata.org/entity/P1921
[P1793]: http://www.wikidata.org/entity/P1793
[P1629]: http://www.wikidata.org/entity/P1629
[P2689]: http://www.wikidata.org/entity/P2689
[P4390]: http://www.wikidata.org/entity/P4390

### GET /mappings/voc

Look up Wikidata items with [Wikidata properties for authority control] as [JSKOS Concept Schemes] used for mappings. These schemes need to have a BARTOC-ID ([P2689]) and be the subject item ([P1629]) of an external identifier property with statements [P1921] (URI template) and [P1793] (regular expression).

* **URL Params**

  None.

* **Success Response**

  JSON array of [JSKOS Concept Schemes]

### GET /mappings/:_id

Returns a specific mapping for a Wikidata claim/statement.

* **Success Response**

  JSKOS object for mapping.

* **Error Response**

  If no claim with `_id` could be found, it will return a 404 not found error.

* **Sample Call**

  ```bash
  curl https://coli-conc.gbv.de/services/wikidata/mappings/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598
  ```

  ```json
  {
    "uri": "http://localhost:2013/mappings/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
    "identifier": [
      "http://www.wikidata.org/entity/statement/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
      "urn:jskos:mapping:content:2807c55eac85ed8c0c9254ff04b457f89b801ac9",
      "urn:jskos:mapping:members:daafcd8580e6f0304f0b1cee024f65f04da98a3c"
    ],
    "to": {
      "memberSet": [
        {
          "uri": "http://rvk.uni-regensburg.de/nt/VK",
          "notation": [
            "VK"
          ]
        }
      ]
    },
    "type": [
      "http://www.w3.org/2004/02/skos/core#exactMatch"
    ],
    "fromScheme": {
      "uri": "http://bartoc.org/en/node/1940",
      "notation": [
        "WD"
      ]
    },
    "toScheme": {
      "uri": "http://bartoc.org/en/node/533",
      "notation": [
        "RVK"
      ]
    },
    "from": {
      "memberSet": [
        {
          "uri": "http://www.wikidata.org/entity/Q11351",
          "notation": [
            "Q11351"
          ]
        }
      ]
    },
    "@context": "https://gbv.github.io/jskos/context.json"
  }
  ```

### POST /mappings

Saves a mapping in Wikidata. Requires [authentication](#authentication).

Note that if an existing mapping in Wikidata is found with the exact same members, that mapping will be overwritten by this request.

* **Success Response**

  JSKOS Mapping object as it was saved in Wikidata.

### PUT /mappings/:_id

Overwrites a mapping in Wikidata. Requires [authentication](#authentication).

* **Success Response**

  JSKOS Mapping object as it was saved in Wikidata.

### DELETE /mappings/:_id

Deletes a mapping from Wikidata. Requires [authentication](#authentication).

* **Success Response**

  Status 204, no content.

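To illustrate the query parameters of `GET /mappings` described earlier, here is a minimal sketch that assembles a request URL with the standard `URL` API. The helper `mappingsUrl` is hypothetical (it is not exported by wikidata-jskos), and the base URL is assumed to end with a slash so that the relative path `mappings` is appended rather than replacing the last path segment.

```js
// Hypothetical helper, not part of wikidata-jskos: builds a GET /mappings URL
// from the documented query parameters (from, to, fromScheme, toScheme, ...).
function mappingsUrl(baseUrl, params) {
  // baseUrl is assumed to end with "/" (e.g. "http://localhost:2013/")
  const url = new URL("mappings", baseUrl)
  for (const [key, value] of Object.entries(params)) {
    // Multiple URIs or notations are joined with "|" as the endpoint expects
    url.searchParams.set(key, Array.isArray(value) ? value.join("|") : String(value))
  }
  return url
}

const url = mappingsUrl("https://coli-conc.gbv.de/services/wikidata/", {
  from: ["http://www.wikidata.org/entity/Q42"],
  toScheme: "http://bartoc.org/en/node/430",
  direction: "forward",
})

console.log(url.href)
```

The resulting URL can be passed to any HTTP client, for example the global `fetch` in Node.js 18 or later. For the write endpoints, the request would additionally need the Bearer token described under [Authentication](#authentication).
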
## Command line tool

The command line client `wdjskos` provides roughly the same commands as accessible via [the web service](#web-service).

Mapping schemes are cached in the subfolder `./cache`. To update the cache include option `--force` or run command `update`.

### wdjskos concept

Look up Wikidata items as [JSKOS Concepts].

    wdjskos concept Q42

### wdjskos mappings

Look up [JSKOS Concept Mappings].

    wdjskos mappings Q42 | jq .to.memberSet[].uri
    wdjskos mappings - http://viaf.org/viaf/113230702

A single hyphen (`-`) can be used to nullify argument `from` or `to`, respectively.

Mappings can be limited to a target scheme. These are equivalent:

    wdjskos --scheme P227 mappings Q42
    wdjskos --scheme 430 mappings Q42
    wdjskos --scheme http://bartoc.org/en/node/430 mappings Q42

### wdjskos schemes

Look up [JSKOS Concept Schemes] with [Wikidata properties for authority control].

### wdjskos update

Look up concept schemes from Wikidata and update the cache.

### wdjskos find

Search a Wikidata item by its names and return an OpenSearch Suggestions response.

### wdjskos mapping-item

Convert a JSKOS mapping to a Wikidata item.

    wdjskos mapping-item mapping.json
    wdjskos --simplify mapping-item mapping.json

## API

The node library can be used to convert Wikidata JSON format to JSKOS ([mapEntity]) and to convert JSKOS mappings to Wikidata JSON format ([mapMapping]).

### mapEntity

```js
jskos = wds.mapEntity(entity)
```

Entity data can be retrieved via Wikidata API method [wbgetentities] and from Wikidata database dumps. See JavaScript libraries [wikidata-sdk] and [wikidata-filter] for easy access and processing.

#### Map selected parts of a Wikidata entity

All methods return a JSKOS item.

```js
jskos = wds.mapIdentifier(entity.id)     // { uri: "http://www.wikidata.org/entity/...", notation: [ "..." ] }
jskos = wds.mapLabels(entity.labels)     // { prefLabel: { ... } }
jskos = wds.mapAliases(entity.aliases)   // { altLabel: { ... } }
jskos = wds.mapDescriptions(entity.descriptions) // { scopeNote: { ... } }
jskos = wds.mapSitelinks(entity.sitelinks) // { occurrences: [ { ... } ], subjectOf: [ { url: ... }, ... ] }
jskos = wds.mapClaims(entity.claims)     // ...

// convert claims with mapping properties
jskos = wds.mapMappingClaims(claims)

jskos = wds.mapInfo(entity)   // ...
```

#### Map simplified Wikidata entities

Each method has a counterpart to map simplified Wikidata entities.

```js
jskos = wds.mapSimpleEntity(entity)
jskos = wds.mapSimpleIdentifier(entity.id)
jskos = wds.mapSimpleLabels(entity.labels)
...
```

### mapMapping

Convert a JSKOS mapping into a Wikidata claim. Only respects JSKOS fields `from`, `to`, `uri`, and `type` (if given) and only supports 1-to-1 mappings from a single Wikidata item to a concept in another concept scheme.

*This is work in progress!*

## Maintainers

- [@nichtich](https://github.com/nichtich)

### Publish

**For maintainers only**

Please work on the `dev` branch during development (or better yet, develop in a feature branch and merge into `dev` when ready).

When a new release is ready (i.e. the features are finished, merged into `dev`, and all tests succeed), run the included release script (replace "patch" with "minor" or "major" if necessary):

```bash
npm run release:patch
```

This will:

- Check that we are on `dev`
- Run tests and build to make sure everything works
- Make sure `dev` is up-to-date
- Run `npm version patch` (or "minor"/"major")
- **Ask you to confirm the version**
- Push changes to `dev`
- Switch to `main`
- Merge changes from `dev`
- Push `main` with tags
- Switch back to `dev`

After running this, GitHub Actions will automatically create a new GitHub Release draft. Please edit and publish the release manually.

## Contributing

PRs accepted against the `dev` branch.

To enable debugging output, set the environment variable `DEBUG` to a comma-separated list of components (`sparql`, `http`, `query`).

Please lint JavaScript code (e.g. run `npm run lint` or `npm run fix`).

If editing the README, please conform to the [standard-readme](https://github.com/RichardLitt/standard-readme) specification.

## License

[MIT © 2024 Verbundzentrale des GBV (VZG)](LICENSE.md)

[wbgetentities]: https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities
[wikidata-sdk]: https://github.com/maxlath/wikidata-sdk
[wikidata-cli]: https://github.com/maxlath/wikidata-cli
[wikidata-filter]: https://github.com/maxlath/wikidata-filter
[Wikidata]: https://www.wikidata.org/
[JSKOS]: https://gbv.github.io/jskos/
[login-server]: https://github.com/gbv/login-server