# bibutils.js

Node wrapper for Chris Putnam's bibutils program set

## Purpose

Provides dictionaries to get useful metadata about various bibliography formats.

Provides a convenience Node.js wrapper for Chris Putnam's `bibutils` program set, allowing conversion between subsets of bibliography formats.

```javascript
var bibutils = require('bibutils.js');

// Acquire the appropriate format identifiers
var fromFormat = bibutils.formats.constants.from.BIBTEX;
var toFormat = bibutils.formats.human.to['RIS'];

// Get the metadata we want
var toMime = bibutils.metadata.mime[toFormat]; // => 'application/x-research-info-systems'
var toExtension = bibutils.metadata.extension[toFormat]; // => '.ris'

// Get the sample BibTeX string to convert.
var myBibliographyString = bibutils.sampleBibtexString;

// Convert between the two formats
bibutils.convert(fromFormat, toFormat, myBibliographyString, function (data) {
  // Prints the BibTeX sample converted to RIS format
  console.log(data);
});
```

## Installation

This is a [Node.js](https://nodejs.org/en/) module available through the [npm registry](https://www.npmjs.com/).

Installation is done using the [`npm install` command](https://docs.npmjs.com/getting-started/installing-npm-packages-locally):

```bash
$ npm install bibutils.js
```

## Features

Provides human readable name, MIME type and file extension information for several bibliography formats.

Allows conversion from and to subsets of those bibliography formats, using the [Library of Congress](https://www.loc.gov/)'s [Metadata Object Description Schema](http://www.loc.gov/standards/mods/) (MODS) version 3.1 as an intermediate format.

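Because the set of source formats differs from the set of target formats, it can be useful to check a pair before attempting a conversion. The following is only a sketch: `isSupportedPair` is a hypothetical helper, not part of `bibutils.js`, and `stubConstants` is a small stand-in shaped like the `.formats.constants` object described under [Format Identifier Acquisition](#format-identifier-acquisition), so the snippet runs without the module installed.

```javascript
// Hypothetical helper (not part of bibutils.js): returns true when `fromId`
// is one of the values in constants.from and `toId` is one of the values in
// constants.to.
function isSupportedPair(constants, fromId, toId) {
  function hasValue(table, id) {
    return Object.keys(table || {}).some(function (key) {
      return table[key] === id;
    });
  }
  return hasValue(constants.from, fromId) && hasValue(constants.to, toId);
}

// Stand-in excerpt for illustration only; with the module installed, pass
// require('bibutils.js').formats.constants instead.
var stubConstants = {
  from: { BIBTEX: 'bib', COPAC: 'copac', RIS: 'ris' },
  to: { ADS: 'ads', BIBTEX: 'bib', RIS: 'ris' }
};

// COPAC is a source format and RIS is a target format
console.log(isSupportedPair(stubConstants, stubConstants.from.COPAC, stubConstants.to.RIS)); // true

// ADS is only a target format, so it cannot be used as a source
console.log(isSupportedPair(stubConstants, stubConstants.to.ADS, stubConstants.to.BIBTEX)); // false
```
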
Conversion is supported from the following formats:

* BibTeX
* BibLaTeX
* [COPAC](http://copac.jisc.ac.uk/help/export-help.html#tagged)
* EBI XML
* EndNote
* EndNote XML
* [ISI Web of Science](http://wiki.cns.iu.edu/pages/viewpage.action?pageId=1933374)
* Pubmed XML
* NBIB MEDLINE
* RIS
* Word 2007 Bibliography
* [MODS](http://www.loc.gov/standards/mods/)

Conversion is supported to the following formats:

* ADS - [NASA Astrophysics Data System](https://en.wikipedia.org/wiki/Astrophysics_Data_System) [Tagged Format](http://doc.adsabs.harvard.edu/abs_doc/help_pages/taggedformat.html)
* BibTeX
* EndNote
* [ISI Web of Science](http://wiki.cns.iu.edu/pages/viewpage.action?pageId=1933374)
* RIS
* Word 2007 Bibliography
* [MODS](http://www.loc.gov/standards/mods/)

## Quick Start

Install the module to your project:

```bash
$ npm install bibutils.js
```

Include the module in your project:

```javascript
var bibutils = require('bibutils.js');
```

Acquire the appropriate format identifiers for your required usage; see the [Format Identifier Acquisition](#format-identifier-acquisition) section.

The example below selects `fromFormat` with a constant identifier, and `toFormat` using the human readable string `'RIS'`.

```javascript
var fromFormat = bibutils.formats.constants.from.BIBTEX;
var toFormat = bibutils.formats.human.to['RIS'];
```

### Get Metadata

Get any metadata you need from the `.metadata` object:

```javascript
var fromMime = bibutils.metadata.mime[fromFormat];
var fromExtension = bibutils.metadata.extension[fromFormat];
var fromHumanReadableName = bibutils.metadata.human[fromFormat];

var toMime = bibutils.metadata.mime[toFormat];
var toExtension = bibutils.metadata.extension[toFormat];
var toHumanReadableName = bibutils.metadata.human[toFormat];
```

### Convert Between Formats

Write a callback to be called when conversion is completed. `data` is the converted bibliography as a string.

```javascript
var callback = function (data) {
  console.log(data);
};
```

Call the `convert` function with a bibliography string in the format specified by `fromFormat` (BibTeX in this example):

```javascript
// Get some bibliography string
var myBibliographyString = bibutils.sampleBibtexString;

// Convert
bibutils.convert(fromFormat, toFormat, myBibliographyString, callback);
```

## Get Format Metadata

`bibutils.js` provides mappings from format identifier (see [format identifier acquisition](#format-identifier-acquisition)) to MIME type, human readable name, and file extension.

These are provided by the `.metadata.mime`, `.metadata.human`, and `.metadata.extension` objects, respectively.

```javascript
// Get MIME type, human name, and extension for ISI documents
var identifier = bibutils.formats.constants.to.ISI;
var mimeType = bibutils.metadata.mime[identifier];
var extension = bibutils.metadata.extension[identifier];
var human = bibutils.metadata.human[identifier];
```

`bibutils.js` does not currently have a specific MIME type for the ADS Tagged Format, EBI XML format, COPAC formatted reference, or the Word 2007 Bibliography format. It also does not currently have a specific extension for the COPAC formatted reference. Sensible defaults have been assumed. If you know of a specific MIME type or extension for any of these, please submit a pull request with an appropriate reference!

## Convert Between Bibliography Formats

When converting between bibliography formats, you must tell `bibutils.js` which format you are converting from and which format you are converting to. This requires acquisition of the [format identifier](#format-identifier-acquisition).

You must also write your application to use callbacks; `bibutils.js`'s `convert` function is asynchronous.

```javascript
var callback = function (data) {
  console.log(data);
};

// Get some bibliography string
var myBibliographyString = bibutils.sampleBibtexString;

// Convert
bibutils.convert(fromFormat, toFormat, myBibliographyString, callback);
```

### Options

`bibutils.js` allows for passthrough of arguments to the `bibutils` program set. These are not checked for correctness by `bibutils.js` and no protections are provided.

Each conversion performed by `bibutils.js` uses two executions of a `bibutils` program: `inFormat -> MODS`, then `MODS -> outFormat`. For instance, if asking to convert from RIS to BibTeX, your bibliography string is first converted from RIS to MODS, and then from MODS to BibTeX.

You can pass in two arrays as optional parameters to provide arguments. The first array provides parameters for the conversion from a format to MODS, and the second array provides parameters for the conversion from MODS to a format.

```javascript
var from = bibutils.formats.constants.from.RIS;
var to = bibutils.formats.constants.to.BIBTEX;
var options1 = ['-as', './test.txt'];
var options2 = ['-U'];
bibutils.convert(from, to, risString, callback, options1, options2);
```

The above specifies a file containing a list of names that should be left as is for the RIS -> MODS conversion (`-as ./test.txt`), and specifies that all BibTeX tags/types should be in uppercase (`-U`).

You can find the arguments accepted by the `bibutils` program that you are using by [reading the official bibutils documentation](https://sourceforge.net/p/bibutils/home/Bibutils/) or running it manually with the `-h` argument.

## Format Identifier Acquisition

To use `bibutils.js`, you need to identify which format you are using. `bibutils.js` exposes the formats it accepts with the `.formats` variable.

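To get an overview of what `.formats` exposes before choosing a lookup method, its tables can simply be enumerated. The sketch below is illustrative only: `summarizeFormats` is a hypothetical helper, not part of `bibutils.js`, and `stubFormats` is a small stand-in shaped like the `.formats` tables documented in this section, so the snippet runs without the module installed.

```javascript
// Hypothetical helper (not part of bibutils.js): lists the keys of each
// lookup table under `.formats`. Assumes the layout described in this README
// (constants.from/to, human.from/to, mime, extension); a missing table
// yields an empty list.
function summarizeFormats(formats) {
  function keysOf(table) {
    return table ? Object.keys(table) : [];
  }
  var constants = formats.constants || {};
  var human = formats.human || {};
  return {
    constantsFrom: keysOf(constants.from),
    constantsTo: keysOf(constants.to),
    humanFrom: keysOf(human.from),
    humanTo: keysOf(human.to),
    mime: keysOf(formats.mime),
    extension: keysOf(formats.extension)
  };
}

// Stand-in data for illustration only; with the module installed, pass
// require('bibutils.js').formats instead.
var stubFormats = {
  constants: { from: { BIBTEX: 'bib', RIS: 'ris' }, to: { RIS: 'ris' } },
  human: { from: { 'BibTeX': 'bib' }, to: { 'RIS': 'ris' } },
  mime: { 'application/x-bibtex': ['bib', 'biblatex'] },
  extension: { '.ris': 'ris' }
};

var summary = summarizeFormats(stubFormats);
console.log(summary.constantsFrom);
console.log(summary.humanTo);
```

Passing the `formats` object in as a parameter keeps the helper independent of how the module is loaded.
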
Formats can be specified [using constants](#identifier-via-constants), [using human readable names](#identifier-via-human-readable-name), by [looking up the MIME type](#identifier-via-mime-type), or by [looking up the file extension](#identifier-via-file-extension).

The preferred methods are by constants or by human readable names. Selecting by MIME type or file extension has unfortunate ambiguities.

Please note that although each of these methods returns some identifier, these identifier values should never be hardcoded. Select them via constants instead. This isolates your application from the implementation details of `bibutils.js`.

```javascript
// Correct
var identifierCorrect = bibutils.formats.constants.from.BIBTEX;

// Incorrect
var identifierWrong = 'bib';
```

### Identifier via Constants

For convenience when using the `bibutils.js` `.convert` function, constants have been specified in two objects.

`.formats.constants.from` is an object of all pairings that `bibutils.js` can convert from, and `.formats.constants.to` is an object of all pairings that it can convert to.

```javascript
bibutils.formats.constants.from = {
  BIBTEX : 'bib',
  COPAC : 'copac',
  ENDNOTE_REFER : 'end',
  ENDNOTE_TAGGED : 'end',
  ENDNOTE : 'end',
  ENDNOTE_XML : 'endx',
  ISI_WEB_OF_SCIENCE : 'isi',
  ISI : 'isi',
  PUBMED_XML : 'med',
  PUBMED : 'med',
  METADATA_OBJECT_DESCRIPTION_SCHEMA : 'xml',
  MODS : 'xml',
  RIS_RESEARCH_INFORMATION_SYSTEMS : 'ris',
  RIS : 'ris',
};

bibutils.formats.constants.to = {
  NASA_ASTROPHYSICS_DATA_SYSTEM : 'ads',
  ADS : 'ads',
  BIBTEX : 'bib',
  ENDNOTE : 'end',
  ENDNOTE_REFER : 'end',
  ISI_WEB_OF_SCIENCE : 'isi',
  ISI : 'isi',
  RIS_RESEARCH_INFORMATION_SYSTEMS : 'ris',
  RIS : 'ris',
  WORD_2007_BIBLIOGRAPHY : 'wordbib',
  WORDBIB : 'wordbib',
  METADATA_OBJECT_DESCRIPTION_SCHEMA : 'xml',
  MODS : 'xml',
};
```

Example use of constants to convert from RIS to ADS:

```javascript
var convertFrom = bibutils.formats.constants.from.RIS_RESEARCH_INFORMATION_SYSTEMS;
var convertTo = bibutils.formats.constants.to.NASA_ASTROPHYSICS_DATA_SYSTEM;
```

### Identifier via Human Readable Name

You may well want a user to be able to select the format. `bibutils.js` provides a set of human readable values that you can access for this purpose.

`.formats.human.from` and `.formats.human.to` are the same as the `.formats.constants.from` and `.formats.constants.to` objects, but with duplicates removed and with the key set to a human readable string.

This makes them well suited for display to the user: you can get the human readable list with `Object.keys()`, let the user select an entry in a dropdown, and then acquire the correct format for use in the `.convert()` function.

```javascript
var humanReadable = Object.keys(bibutils.formats.human.to);
// => ['ADS Tagged Format', 'BibTeX', 'EndNote', 'ISI', ...]
var selection = humanReadable[1];
var bibtexIdentifier = bibutils.formats.human.to[selection];
```

### Identifier via MIME Type

In some scenarios, the only thing you know about the data you have is its MIME type. `bibutils.js` contains a mapping from MIME type to format that you can try to use. This is `.formats.mime`.

```javascript
var endnoteIdentifier = bibutils.formats.mime['application/x-endnote-library'];
var isiIdentifier = bibutils.formats.mime['application/x-inst-for-scientific-info'];
```

This lookup can easily fail: if the wrong MIME type is given, an incorrect identifier is returned, and if `bibutils.js` doesn't know about the MIME type, `undefined` is returned. Thus, accessing via MIME type is not a recommended method of identifier acquisition.

Additionally, not all formats have a unique MIME type.
In the case of a MIME type that has multiple associated possible formats, an array of these is returned:

```javascript
var xmlIdentifiers = bibutils.formats.mime['application/xml'];
// => ['ebi','endx','med','wordbib','xml']
```

`bibutils.js` does not currently have a specific MIME type for the ADS Tagged Format, EBI XML format, COPAC formatted reference, or the Word 2007 Bibliography format. If you know of one, please submit a pull request with an appropriate reference!

`bibutils.js` holds the following mapping:

```javascript
bibutils.formats.mime = {
  'application/x-bibtex' : ['bib', 'biblatex'],
  'application/x-endnote-library' : 'endx',
  'application/x-endnote-refer' : 'end',
  'text/x-pubmed' : 'med',
  'application/nbib' : 'nbib',
  'application/x-inst-for-scientific-info' : 'isi',
  'application/x-research-info-systems' : 'ris',
  'application/mods+xml' : 'xml',
  //Nonstandard ones
  'application/xml' : ['ebi','endx','med','wordbib','xml'],
  'text/plain' : ['ads','bib','biblatex','copac','end','isi','nbib','ris'],
  'text/x-bibtex' : 'bib',
  'text/x-endnote-library' : 'endx',
  'text/x-endnote-refer' : 'end',
  'text/mods+xml' : 'xml',
  'text/x-research-info-systems' : 'ris',
  'text/x-inst-for-scientific-info' : 'isi',
  //Nature uses this
  'text/application/x-research-info-systems' : 'ris',
  //Cell uses this
  'text/ris' : 'ris',
};
```

### Identifier via File Extension

As with MIME types, in some scenarios the only thing you know about your string's format is the extension of the file it was read from. `bibutils.js` contains a mapping from file extension to format that you can use. This is `.formats.extension`.

```javascript
// Convert from BibTeX to ISI
var convertFrom = bibutils.formats.extension['.bib'];
var convertTo = bibutils.formats.extension['.isi'];
```

Additionally, not all formats have a unique file extension.
In the case of a file extension that has multiple associated possible formats, an array of these is returned:

```javascript
var xmlIdentifiers = bibutils.formats.extension['.xml'];
// => ['endx','ebi','med','wordbib','xml']
```

`bibutils.js` does not currently have a specific extension for the COPAC formatted reference. `.txt` is assumed. If you know of one, please submit a pull request with an appropriate reference!

`bibutils.js` holds the following mapping:

```javascript
bibutils.formats.extension = {
  '.ads' : 'ads',
  '.bib' : ['bib','biblatex'],
  '.end' : 'end',
  '.isi' : 'isi',
  '.nbib' : 'nbib',
  '.ris' : 'ris',
  '.xml' : ['endx','ebi','med','wordbib','xml'],
  //copac unknown - default to .txt
  '.txt' : 'copac',
};
```

## Tests

To run the test suite, first install the dependencies, then run `npm test`:

```bash
$ npm install
$ npm test
```

## Versions and Operating Systems

`bibutils.js` should work on Linux, OSX, and Windows.

It packages the `bibutils` binaries for 64-bit Linux, 64-bit OSX, and 32-bit Windows.

## People

The writer of the [bibutils](https://sourceforge.net/p/bibutils/home/Bibutils/) [program set](http://bibutils.refbase.org/) is [Chris Putnam](https://sourceforge.net/u/cdputnam/profile/).

The writer of this Node.js module is [Jet Holt](https://github.com/Jetroid).

[List of all contributors](https://github.com/Jetroid/bibutils.js/graphs/contributors)

## Licenses

This module contains code licensed under [GPL-2.0](GPL_LICENSE) and code licensed under [MIT](MIT_LICENSE).

The code licensed under GPL-2.0 is contained in the `/bibutils/` folder. That is, unmodified, compiled binaries of Chris Putnam's `bibutils` program set. The source code to these binaries is included in `/bibutils/` as a gzipped source tarball, complying with the GPL-2.0 License.

The code licensed under MIT is the Node.js code, that is, all code but that found in the `/bibutils/` folder.

The MIT-licensed code included in `bibutils.js`

* has not modified Chris Putnam's `bibutils` source code
* does not contain any of Chris Putnam's `bibutils` source code (that is to say, `bibutils.js` is neither statically nor dynamically linked to `bibutils`)
* executes Chris Putnam's `bibutils` compiled binaries as separate processes (i.e. fork-exec) with their own address spaces, and does not establish intimate communication (sharing internal data structures)
* can exist and (barring the `.convert` functionality) continue to operate without the `bibutils` binary

...therefore we believe that `bibutils.js` is not considered a 'derivative work' of `bibutils` as defined in the GPL-2.0 license. (`bibutils.js` does not contain `bibutils` as a whole or in part as an executable, and has made no modifications.)

The non-`bibutils` code and the `bibutils` code are considered [different programs](https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html#GPLPlugins). `bibutils.js` could be said to execute `bibutils` programs as plugins to enhance `bibutils.js`'s feature set.

`bibutils.js` and `bibutils` are [merely aggregated](https://www.gnu.org/licenses/old-licenses/gpl-2.0-faq.html#MereAggregation). They have been distributed together as separate programs.