jupyter-ijavascript-utils
Version:
Utilities for working with iJavaScript - a Jupyter Kernel
438 lines (350 loc) • 22 kB
Markdown
<p align="center">
<a href="https://jupyter-ijavascript-utils.onrender.com/" alt="Documentation">
<img src="https://img.shields.io/badge/Documentation-here-informational" />
</a>
<a href="https://jupyter-ijavascript-utils.onrender.com/LICENSE" alt="License">
<img src="https://img.shields.io/badge/License-MIT-green" />
</a>
<img src="https://img.shields.io/badge/Coverage-98-green" />
<a href="https://www.npmjs.com/package/jupyter-ijavascript-utils" alt="npm">
<img src="https://img.shields.io/badge/npm-%5E1.X-red" />
</a>
<a href="https://github.com/paulroth3d/jupyter-ijavascript-utils" alt="npm">
<img src="https://img.shields.io/badge/github-here-black" />
</a>
<a href="https://mybinder.org/v2/gh/paulroth3d/jupyter-ijavascript-utils/main" alt="Launch Binder">
<img src="https://mybinder.org/badge_logo.svg" />
</a>
</p>
# Overview
This is a library to help people that understand JavaScript
to leverage for using Jupyter with the iJavaScript kernel
as a way to load and explore data, and ultimately tell compelling stories with visuals.
Jupyter is a way to programmatically explore a subject and interleave text and markdown to make Data Driven Documents.
See the [#Installation section for requirements and installation](#install)
Notebooks are a way to explore and experiment, in addition to write and explain ideas.
**All of the tutorials provided here, including this one, was written as a notebook and simply exported.**

The document itself is a list of cells that can be either:
* Text or Markdown (like this)
* Or executable Code (NodeJS code in this case)
* rendering images, interactive charts, maps or even generated text based on data.

**note: You can choose whether to show the code or simply show the results.**
## This Library helps with the Coding Side of Things
The [jupyter-ijavascript-utils](https://www.npmjs.com/package/jupyter-ijavascript-utils) library is simply a collection of utility methods for Node and JavaScript Developers interested in Data Science.
* **Load**
* (ex: { module:file}, { module:datasets} or the { e_gettingData} tutorial)
* **Manipulate and Refine**
* (ex: { module:array}, { module:object}, or { module:set})
* **Aggregate, Manipulate and Explore**
* (ex: { module:group}, { module:aggregate} or the { f_dataframes} tutorial)
* **Format / Visualize**
* (ex: { module:format}, { module:vega}, or the { g_visualization} tutorial)
* **Export**
* (ex: { TableGenerator}, or the { h_exporting} tutorial)
Currently, we assume you'll be using [nriesco's iJavaScript Jupyter Kernel](https://github.com/n-riesco/ijavascript) and the [Jupyter Lab - the latest interface for Jupyter](https://jupyter.org/) - and the installation is fairly simple in the {@tutorial howToUse} guide. (Although suggestions welcome)
This is not intended to be the only way to accomplish many of these tasks, and alternatives are mentioned in the documentation as available.

## Running on Binder
[mybinder.org](https://mybinder.org/) is a great place to run a Jupyter Notebook online.
It means you can run Jupyter Notebooks with additional kernels without having to install anything,
and can try right in your browser.
Give it a try here:
[](https://mybinder.org/v2/gh/paulroth3d/jupyter-ijavascript-utils/main?labpath=example.ipynb)
## What's New
* 1.60 - Make post-processing of documents easier with { module:ijs.markDocumentPosition|ijs.markDocumentPosition}
* 1.59 -
* #95 - give control with page breaks. So we can render text before the page break (like for headers) - or even get the html used and render it how we want. (ex: { module:ijs.generatePageBreakStylesHTML|ijs.printPageBreak})
* #96 - Support { module:ijs.internalComment|ijs.internalComment} in notebooks (meaning it can render in markdown, but can be disabled when preparing to print)
* 1.58 -
* #91 - add { module:object.splitIntoDatums|splitIntoDatums(collection, fieldsToSplitBy)} - to make working with datum libraries - like vega-lite - easier
* #92 - make using the cache easier for { module:file.useCache|file.useCache()} - as you can say to use the cache, and it will still work if the cache file does not exist yet.
* #93 / #94 - bug fixes
* 1.57 - #86 - include examples on binning based on time series { https://github.com/paulroth3d/jupyter-ijavascript-utils/issues/86|see more here}
* #87 - add in { module:file.fileExists|file.fileExists} to make things easier for new people getting started
* #89 - allow { module:array.resize|array.resize} to work with defaults, if zipping arrays of different sizes
* #90 - correct issue with timezone offsets, so it is no longer possible to get a timezone offset +24:00 from any of the localizations - like { module:date.toLocalISO|date.toLocalISO}
* #92 - make it easier to handle big expensive calculations with a cache, see { module:ijs.useCache|ijs.useCache()} and { module:file.useCache|file.useCache()}
* 1.56 - #84 (object.renamePropertiesFromList), #82 (date.getWeekday)
* 1.55 - #76, #77, #74, #78
* 1.54 - additional Date logic, and formatting. #70 #71 #72
* 1.53 - additional docs and examples for { module:color|color/colour} package.
* 1.52 - print a date to ISO in local time with { module:date.toLocalISO|date.toLocalISO}
* 1.51 - added in { module:date|date} - and addressed issue #69, #68, #67, #66
* 1.50 - added in { module:color|color/colour} - and addressed issue #65
* 1.49 - Additional documentation for issues #18, #63, #62, #61 (like table.generateObjectCollection)
* 1.48 - Correct table rendering html if filter was used ({ https://github.com/paulroth3d/jupyter-ijavascript-utils/issues/64|#64})
* 1.47
* allow conversion from a Collection of Objects to Arrays and back with object. ({ module:object.objectCollectionFromArray|object.objectCollectionFromArray}, { module:object.objectCollectionToArray|object.objectCollectionToArray})
* Easier iterating over values and peeking with array with { module:array~PeekableArrayIterator|PeekableArrayIterator}
* Support for delayed and asynchronous chaining of functions ({ module:array.delayedFn|delayedFn}, { module:array.chainFunctions|chainFunctions}, etc)
* 1.46 - Make it easier to extract data from "hard-spaced arrays" - { module:array.multiLineSubstr}, { module:array.multiStepReduce}
* 1.45 - more ways to understand the data - { module:aggregate.coalesce|aggregate.coalesce()}, convert properties to arrow/dot notation / reverse it { module:object.flatten|object.flatten()} / { module:object.expand|object.expand()} and { module:object.isObject|object.isObject()}
* 1.43 - esm module fix since still not supported yet in ijavascript
* 1.41 - { module:object.propertyInherit|object.propertyInherit} - to simplify inheriting values from one record to the next
* 1.40 - { module:array.extract|array.extract} and { module:array.applyArrayValues|array.applyArrayValues} to allow for extracting values from arrays, transforming them on a separate process and applying them deeply and safely
* 1.39 - { module:format.extractWords|format.exportWords} - to identify distinct words in strings using unicode character properties
* 1.38 - { module:object.extractObjectProperty|object.extractObjectProperty} / { module:object.applyPropertyValue|object.applyPropertyValue} to allow for extracting values from arrays, transforming them on a separate process and applying them back
* 1.37 - { module:format.replaceString|format.replaceString} as convenience for replacing only a single string.
* 1.36 - { module:format.replaceStrings|format.replaceStrings} to allow for replacement dictionaries and tuplets
* 1.35 - { module:object.extractObjectProperties|extractObjectProperties} / { module:object.extractObjectProperty|extractObjectProperty} - to do horizontal transposes on objects
* 1.34 - { module:format.mapArrayDomain|format.mapArrayDomain} and add notes in the header of { module:random|random} on using non-uniform distributions.
* 1.33 - Object.augmentInherit and Object.union
* 1.32 - Array.indexify to identify sections within a 1d array into a hierarchy.
* 1.31 - harden Array.transpose for arrays with nulls, and Table.generateTSV
* 1.30 - add Format.wordWrap and Format.lineCount
* 1.29 - Updated TableGenerator.format method
* 1.28 - Sticky table headers for table.render
* 1.27 - Multi-Dimensional arange (initialize array along multiple dimensions)
* 1.26 - Support for file.writeFile and file.writeJSON to append
* 1.25 - Additional chain methods and documentation
* 1.24 - format.stripHtmlTags, TableGenerator.offset, chain.chainFlatMap, chain.chainFilter
* 1.23 - add format.parseNumber and TableGenerator.styleColumn, align group.separateByFields to vega-lite fold transform
* 1.22 - make chain iJavaScript aware, but still able to work outside of Jupyter
* 1.21 - include { module:chain|chain} - simple monoid
* 1.20 - fix vega dependency
* 1.19 - add in { module:describe|describe} and { module:hashMap|hashMap} modules, along with { module:format.limitLines|format.limitLines}
* 1.18 - tie to vega-datasets avoiding esmodules until ijavascript can support them
* 1.17 - provide object.propertyValueSample - as a way to list 'non-empty' property values
* 1.16 - provide file.matchFiles - as a way to find files or directories
* 1.15 - provide { module:object.formatProperties|object.formatProperties} - as a way to quickly convert to string, number, etc.
* 1.14 - provide { module:object.mapProperties|object.mapProperties()} and { module:format.compactNumber|format.compactNumber()}
* 1.13 - provide { module:random|utils.random()} to genrate random values
* 1.12 - provide `utils.table(...)` instead of `new utils.TableGenerator(...)`
* 1.11 - provide { module:aggregate.topValues|topValues} (like top 5, bottom 3)
* 1.10 - provide { module:aggregate.percentile|percentile} (like 50th percentile) aggregates
* 1.9 - allow { TableGenerator#transpose|transposing results} on TableGenerator.
* 1.8 - add in What can I Do tutorial, and { module:object.join|object.join methods}
* 1.7 - revamp of `animation` method for ijs.htmlScript
* 1.6 - add SVG support for rendering SVGs and animations with { module:svg}.
* 1.5 - Add LaTeX / KaTeX support with { module:latex} for rendering Math formulas and PlantUML support for Diagrams
* 1.4 - Add in vega embed, vega mimetypes with { module:vega} and example choropleth tutorial
* 1.3 - Add { module:leaflet|Leaflet} for Maps, allow Vega to use { module:vega.svgFromSpec|explicit specs} (so [Examples can be copied and pasted](https://vega.github.io/vega-lite/examples/), and add in {@link module:ijs.htmlScript|htmlScripts}
-------
# Module Overview
| Export | Description |
|-------------------------------|-----------------------------------------------------------------------------------------------------|
| { module:aggregate} | Aggregate collections or collections of objects (ex: min, max, unique, contains, etc. |
| { module:array} | Massage, sort, reshape arrays. |
| { module:base64} | Convert to and from base64 encoding of strings |
| { module:chain} | Simple wrapper (Monad-ish) that allows for chaining statements together |
| { module:datasets} | Load example <a href="https://github.com/vega/vega-datasets">datasets provided by the vega team</a> |
| { module:describe} | Similar to Pandas describe, provides statistics on a set of values / objects |
| { module:file} | Read and write data/text to files. |
| { module:format} | Formatting and massage data to be legible. |
| { module:group} | Group/Reduce Hierarchies of Object - generating Maps of records ({ SourceMap}) |
| { module:hashMap} | Modify JavaScript HashMaps (ex new Map()) |
| { module:ijs} | Extend iJavaScript to support await, and new types of rendering - like { htmlScript} and markdown|
| { module:latex} | Render Math Notation with <a href="www.latex-project.org">LaTeX<a> and <a href="katex.org">KaTeX</a>|
| { module:leaflet} | Render maps with <a href="leaflet.org">Leaflet</a> |
| { module:object} | Massage and manipulate Objects or Collections of Objects. |
| { module:plantuml} | Render <a href="https://plantuml.com">PlantUML</a> within Jupyter results. |
| { module:random} | Generate random values, pick from arrays, or create natural looking images based on 2d/3d space. |
| { module:set} | Functional Utilities for managing JavaScript Sets - allowing for chaining. |
| { module:svg} | Programmatically create SVGs (either jupyter side for exports, or client side for animations) |
| { module:vega} | Generate Charts / Graphs / Maps with <a href="https://vega.github.io/vega/">Vega</a> and <a href="https://vega.github.io/vega-lite/">Vega-Lite</a> |
| { SourceMap} | SubClass of Maps - generated by the { module:group} and reducible with { module:aggregate}|
| { TableGenerator} | Class that can filter, sorts, manage and then generate HTML, CSV, Markdown, etc. |
-------
# For Example
## Get Sample Data
({ module:datasets|See the DataSets module for more on sample datasets})
({ module:ijs.await|See the ijs module for helpers to use async/await})
```
//-- get the data
utils.ijs.await(async ($$, console) => {
barley = await utils.datasets.fetch('barley.json');
//-- continue to use the barley dataset, or wait to the next cell
});
```
## Group By
Then we can group using a process similar to d3js
({ module:group|see the Group module for more on grouping})
```
//-- get the min max of the types of barley
barleyByVarietySite = utils.group.by(barley, 'variety', 'site')
// SourceMap(10) [Map] {
// 'Manchuria' => SourceMap(6) [Map] {
// 'University Farm' => [ [Object], [Object] ],
// 'Waseca' => [ [Object], [Object] ],
// 'Morris' => [ [Object], [Object] ],
// 'Crookston' => [ [Object], [Object] ],
// 'Grand Rapids' => [ [Object], [Object] ],
// 'Duluth' => [ [Object], [Object] ],
// source: 'site'
// },
// 'Glabron' => SourceMap(6) [Map] {
// 'University Farm' => [ [Object], [Object] ],
// 'Waseca' => [ [Object], [Object] ],
// 'Morris' => [ [Object], [Object] ],
// 'Crookston' => [ [Object], [Object] ],
// 'Grand Rapids' => [ [Object], [Object] ],
// 'Duluth' => [ [Object], [Object] ],
// source: 'site'
// },
// ...
// }
//-- now group by variety and year
barleyByVarietyYear = utils.group.by(barley, 'variety', 'year')
// SourceMap(10) [Map] {
// 'Manchuria' => SourceMap(2) [Map] {
// 1931 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// 1932 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// source: 'year'
// },
// 'Glabron' => SourceMap(2) [Map] {
// 1931 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// 1932 => [ [Object], [Object], [Object], [Object], [Object], [Object] ],
// source: 'year'
// },
// ...
// }
```
## Aggregating
({ module:aggregate|See the Aggregation module for more})
```
utils.group.by(barley, 'variety', 'site')
.reduce((collection) => ({
years: utils.aggregate.extent(collection, 'year'),
numRecords: utils.aggregate.length(collection),
yield_sum: utils.aggregate.sum(collection, 'yield'),
yield_min: utils.aggregate.min(collection, 'yield'),
yield_max: utils.aggregate.max(collection, 'yield'),
yield_diff: utils.aggregate.difference(collection, 'yield')
}));
returns
[
{
variety: 'Manchuria',
site: 'University Farm',
years: { min: 1931, max: 1932 },
numRecords: 2,
yield_sum: 53.9,
yield_min: 26.9,
yield_max: 27,
yield_diff: 0.100
},
{
variety: 'Manchuria',
site: 'Waseca',
years: { min: 1931, max: 1932 },
numRecords: 2,
yield_sum: 82.33333,
yield_min: 33.46667,
yield_max: 48.86667,
yield_diff: 15.39999
},
...
];
```
## Render as a Table
({ TableGenerator|See the TableGenerator class for more})
```
new utils.TableGenerator(barley)
.sort('-yield')
.formatter({ year: (v) => `${v}`})
.limit(10)
.render()
```

## Show a Graph
(See the { vegaLite1} tutorial or the { module:vega|Vega module} for more)
```
//-- make a point chart
utils.vega.svg((vl) => vl.markPoint()
//-- data as an array of items
.data(barley)
.title('Barley Yield by Site')
.width(600)
.encode(
//-- x position is Nominal - not a number
vl.x().fieldN('site'),
//-- y position is Quantitative - a number
vl.y().fieldQ('yield'),
//-- Color is based on the year field
vl.color().fieldN('year')
)
)
```

Where making it into a bar chart, to understand the proportions of varieties grown is simply changing the mark type
```
// change from markPoint to markBar
utils.vega.svg((vl) => vl.markBar()
//-- data as an array of items
.data(barley)
.title('Barley Yield by Site Variety')
.width(600)
.encode(
//-- x position is Nominal - not a number
vl.x().fieldN('site').title('Site'),
//-- y position is Quantitative - a number
vl.y().fieldQ('yield').title('Yield'),
//-- Color is based on the variety field
vl.color().fieldN('variety').title('Variety')
)
)
```

With further options to zoom, pan, or setup interactive sliders:

Or try your hand at the [Vega Lite Examples](https://vega.github.io/vega-lite/examples/) and more from {@link module:vega}

## Create a Data Driven Map
(See the { vega_choroplethShort} tutorial for more)

## Render Maps
({ module:leaflet|See the Leaflet module for more})

## Generate Text Driven Diagrams
({ module:plantuml|See the PlantUML module for more})

## Render Other Libraries
(See the { htmlScript} tutorial for more)
```
utils.ijs.htmlScript({
scripts: ['https://cdnjs.cloudflare.com/ajax/libs/qrcodejs/1.0.0/qrcode.min.js'],
height: '100%',
onReady: ({rootEl}) => {
new QRCode(rootEl, "https://jupyter-ijavascript-utils.onrender.com/");
}
});
```

## Create Animations
(See the { noiseVisualization} tutorial or { module:svg|svg module} for more)


<a name="install"> </a>
# Install
Note that some of the utilities assumes you are running within Jupyter - within n-riesco's iJavaScript kernel (that provides JavaScript language support within Jupyter)
`npm install jupyter-ijavascript-utils`
Depends on:
* [Jupyter Tools - such as Jupyter Lab](https://jupyter.org/)
* [n-riesco/ijavascript jupyter kernel](https://github.com/n-riesco/ijavascript#installation)
See the [How to Use]{ howToUse} section for more.
## ESM Modules + D3
Note that we strongly recommend using this with other modules like D3 - that only support ESM modules now.
There is a [known issue #210](https://github.com/n-riesco/ijavascript/issues/210) in the iJavaScript kernel.
So if you try to import libraries like d3 [and get comments like this](https://github.com/n-riesco/ijavascript/issues/279)
```
$ node -e "import defaultExport from './test.mjs'"
[eval]:1
import defaultExport from './test.mjs'
^^^^^^
SyntaxError: Cannot use import statement outside a module
at new Script (vm.js:88:7)
at createScript (vm.js:263:10)
at Object.runInThisContext (vm.js:311:10)
at Object.<anonymous> ([eval]-wrapper:10:26)
at Module._compile (internal/modules/cjs/loader.js:1151:30)
at evalScript (internal/process/execution.js:94:25)
at internal/main/eval_string.js:23:3
```
### Use [esm-hook](https://github.com/alex-kinokon/esm-hook) as a workaround for now.
```
require("esm-hook"); // must come before requiring esm modules
d3 = require('d3'); // import esm modules
```
More is found on the documentation for [issue #210](require("esm-hook"))