lumenize
Illuminating the forest AND the trees in your data.
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>The source code</title>
<link href="../resources/prettify/prettify.css" type="text/css" rel="stylesheet" />
<script type="text/javascript" src="../resources/prettify/prettify.js"></script>
<style type="text/css">
.highlight { display: block; background-color: #ddd; }
</style>
<script type="text/javascript">
function highlight() {
document.getElementById(location.hash.replace(/#/, "")).className = "highlight";
}
</script>
</head>
<body onload="prettyPrint(); highlight();">
<pre class="prettyprint lang-js">/* <CoffeeScript>
# !TODO: Add summary metrics
# !TODO: Be smart enough to move dependent metrics to the deriveFieldsOnOutput
{utils} = require('tztime')
functions = require('./functions').functions
{arrayOfMaps_To_CSVStyleArray, csvStyleArray_To_ArrayOfMaps} = require('../')
class OLAPCube
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-constructor'><span id='Lumenize-OLAPCube-cfg-deriveFieldsOnOutput'><span id='Lumenize-OLAPCube-cfg-deriveFieldsOnInput'><span id='Lumenize-OLAPCube-cfg-keepFacts'><span id='Lumenize-OLAPCube-cfg-keepTotals'><span id='Lumenize-OLAPCube-cfg-metrics'><span id='Lumenize-OLAPCube-cfg-field'><span id='Lumenize-OLAPCube-cfg-f'><span id='Lumenize-OLAPCube-cfg-groupBy'><span id='Lumenize-OLAPCube-cfg-dimensions'><span id='Lumenize-OLAPCube'> /**
</span></span></span></span></span></span></span></span></span></span></span> * @class Lumenize.OLAPCube
*
* __An efficient, in-memory, incrementally-updateable, hierarchy-capable OLAP Cube implementation.__
*
* [OLAP Cubes](http://en.wikipedia.org/wiki/OLAP_cube) are a powerful abstraction that makes it easier to do everything
* from simple group-by operations to more complex multi-dimensional and hierarchical analysis. This implementation has
* the same conceptual ancestry as implementations found in business intelligence and OLAP database solutions. However,
 * it is meant as a lightweight alternative whose primary goal is to make it easier for developers to implement the
 * analysis they need. It also supports serialization and incremental updating, so it's ideally
 * suited for visualizations and analysis that are updated on a periodic or even continuous basis.
*
* ## Features ##
*
* * In-memory
* * Incrementally-updateable
* * Serialize (`getStateForSaving()`) and deserialize (`newFromSavedState()`) to preserve aggregations between sessions
* * Accepts simple JavaScript Objects as facts
* * Storage and output as simple JavaScript Arrays of Objects
 * * Hierarchy (trees) derived from fact data assuming the [materialized path](http://en.wikipedia.org/wiki/Materialized_path)
 *   array model commonly used with NoSQL databases
*
* ## 2D Example ##
*
* Let's walk through a simple 2D example from facts to output. Let's say you have this set of facts:
*
* facts = [
* {ProjectHierarchy: [1, 2, 3], Priority: 1, Points: 10},
* {ProjectHierarchy: [1, 2, 4], Priority: 2, Points: 5 },
* {ProjectHierarchy: [5] , Priority: 1, Points: 17},
* {ProjectHierarchy: [1, 2] , Priority: 1, Points: 3 },
* ]
*
* The ProjectHierarchy field models its hierarchy (tree) as an array containing a
* [materialized path](http://en.wikipedia.org/wiki/Materialized_path). The first fact is "in" Project 3 whose parent is
* Project 2, whose parent is Project 1. The second fact is "in" Project 4 whose parent is Project 2 which still has
* Project 1 as its parent. Project 5 is another root Project like Project 1; and the fourth fact is "in" Project 2.
* So the first fact will roll-up the tree and be aggregated against [1], and [1, 2] as well as [1, 2, 3]. Root Project 1
* will get the data from all but the third fact which will get aggregated against root Project 5.
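 *
 * That roll-up can be sketched in plain JavaScript (an illustrative sketch of the concept only;
 * `ancestorPaths` is a made-up name, not part of the Lumenize API):
 *
```javascript
// Expand a materialized-path array into every ancestor path it rolls up to.
// A fact with ProjectHierarchy [1, 2, 3] is aggregated against each one.
function ancestorPaths(path) {
  const out = [];
  for (let len = 1; len <= path.length; len++) {
    out.push(path.slice(0, len));
  }
  return out;
}

console.log(ancestorPaths([1, 2, 3]));
// [ [ 1 ], [ 1, 2 ], [ 1, 2, 3 ] ]
```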
*
 * We specify the ProjectHierarchy field as a dimension of type 'hierarchy' and the Priority field as a simple value dimension.
*
* dimensions = [
* {field: "ProjectHierarchy", type: 'hierarchy'},
* {field: "Priority"}
* ]
*
* This will create a 2D "cube" where each unique value for ProjectHierarchy and Priority defines a different cell.
* Note, this happens to be a 2D "cube" (more commonly referred to as a [pivot table](http://en.wikipedia.org/wiki/Pivot_Table)),
* but you can also have a 1D cube (a simple group-by), a 3D cube, or even an n-dimensional hypercube where n is greater than 3.
*
* You can specify any number of metrics to be calculated for each cell in the cube.
*
* metrics = [
* {field: "Points", f: "sum", as: "Scope"}
* ]
*
* You can use any of the aggregation functions found in Lumenize.functions except `count`. The count metric is
* automatically tracked for each cell. The `as` specification is optional unless you provide a custom function. If missing,
 * it will build the name of the resulting metric from the field name and the function. So without the `as: "Scope"`, the
 * metric in the example above would have been named "Points_sum".
*
* You can also use custom functions in the form of `f(values) -> return <some function of values>`.
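 *
 * For example, a custom `f` is just a function of the cell's array of values. (A sketch;
 * the `spread`/`PointSpread` names here are made up and not part of the example above.)
 *
```javascript
// A custom aggregation function receives the array of values collected for a
// cell. Note that a custom function always needs an `as` name.
const spread = (values) => Math.max(...values) - Math.min(...values);

const metrics = [
  {field: "Points", f: "sum", as: "Scope"},        // built-in function
  {field: "Points", f: spread, as: "PointSpread"}  // custom function
];

console.log(spread([10, 5, 17, 3]));  // 14
```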
*
* Next, we build the config parameter from our dimension and metrics specifications.
*
* config = {dimensions, metrics}
*
 * Hierarchy dimensions automatically roll up, but you can also tell it to keep all totals by setting config.keepTotals to
 * true. The totals are then kept in the cells where one or more of the dimension values are set to `null`. Note, you
 * can also set keepTotals for individual dimensions, which you should probably do if you have more than a few dimensions,
 * but we're going to set it globally here:
*
* config.keepTotals = true
*
* Now, let's create the cube.
*
* {OLAPCube} = require('../')
* cube = new OLAPCube(config, facts)
*
* `getCell()` allows you to extract a single cell. The "total" cell for all facts where Priority = 1 can be found as follows:
*
* console.log(cube.getCell({Priority: 1}))
* # { ProjectHierarchy: null, Priority: 1, _count: 3, Scope: 30 }
*
 * Notice how the ProjectHierarchy field value is `null`. This is because it is a total cell for the Priority dimension
 * across all ProjectHierarchy values. Think of `null` values in this context as wildcards.
*
 * Similarly, we can get the total for all descendants of ProjectHierarchy = [1] regardless of Priority as follows:
*
* console.log(cube.getCell({ProjectHierarchy: [1]}))
* # { ProjectHierarchy: [ 1 ], Priority: null, _count: 3, Scope: 18 }
*
* `getCell()` uses the cellIndex so it's very efficient. Using `getCell()` and `getDimensionValues()`, you can iterate
* over a slice of the OLAPCube. It is usually preferable to access the cells in place like this rather than the
* traditional OLAP approach of extracting a slice for processing. However, there is a `slice()` method for extracting
* a 2D slice.
*
* rowValues = cube.getDimensionValues('ProjectHierarchy')
* columnValues = cube.getDimensionValues('Priority')
* s = OLAPCube._padToWidth('', 7) + ' | '
* s += ((OLAPCube._padToWidth(JSON.stringify(c), 7) for c in columnValues).join(' | '))
* s += ' | '
* console.log(s)
* for r in rowValues
* s = OLAPCube._padToWidth(JSON.stringify(r), 7) + ' | '
* for c in columnValues
* cell = cube.getCell({ProjectHierarchy: r, Priority: c})
* if cell?
* cellString = JSON.stringify(cell._count)
* else
* cellString = ''
* s += OLAPCube._padToWidth(cellString, 7) + ' | '
* console.log(s)
 * #         |    null |       1 |       2 |
 * #    null |       4 |       3 |       1 |
 * #     [1] |       3 |       2 |       1 |
 * #   [1,2] |       3 |       2 |       1 |
 * # [1,2,3] |       1 |       1 |         |
 * # [1,2,4] |       1 |         |       1 |
 * #     [5] |       1 |       1 |         |
*
 * Or you can just call the `toString()` method, which extracts a 2D slice for tabular display. Both approaches will work on
 * cubes with two or more dimensions. The manual example above extracted the `_count` metric. We'll tell
 * the example below to extract the `Scope` metric.
*
* console.log(cube.toString('ProjectHierarchy', 'Priority', 'Scope'))
 * # |        || Total |     1     2|
 * # |==============================|
 * # |Total   ||    35 |    30     5|
 * # |------------------------------|
 * # |[1]     ||    18 |    13     5|
 * # |[1,2]   ||    18 |    13     5|
 * # |[1,2,3] ||    10 |    10      |
 * # |[1,2,4] ||     5 |           5|
 * # |[5]     ||    17 |    17      |
*
* ## Dimension types ##
*
* The following dimension types are supported:
*
* 1. Single value
* * Number
* * String
* * Does not work:
* * Boolean - known to fail
* * Object - may sorta work but sort-order at least is not obvious
* * Date - not tested but may actually work
* 2. Arrays as materialized path for hierarchical (tree) data
* 3. Non-hierarchical Arrays ("tags")
*
* There is no need to tell the OLAPCube what type to use with the exception of #2. In that case, add `type: 'hierarchy'`
* to the dimensions row like this:
*
* dimensions = [
* {field: 'hierarchicalDimensionField', type: 'hierarchy'} #, ...
* ]
*
* ## Hierarchical (tree) data ##
*
* This OLAP Cube implementation assumes your hierarchies (trees) are modeled as a
* [materialized path](http://en.wikipedia.org/wiki/Materialized_path) array. This approach is commonly used with NoSQL databases like
* [CouchDB](http://probablyprogramming.com/2008/07/04/storing-hierarchical-data-in-couchdb) and
* [MongoDB (combining materialized path and array of ancestors)](http://docs.mongodb.org/manual/tutorial/model-tree-structures/)
* and even SQL databases supporting array types like [Postgres](http://www.monkeyandcrow.com/blog/hierarchies_with_postgres/).
*
 * This approach differs from the traditional OLAP/MDX fixed/named-level hierarchy approach. In that approach, you assume
 * that the number of levels in the hierarchy is fixed. Also, each level in the hierarchy is either represented by a different
* column (clothing example --> level 0: SEX column - mens vs womens; level 1: TYPE column - pants vs shorts vs shirts; etc.) or
* predetermined ranges of values in a single field (date example --> level 0: year; level 1: quarter; level 2: month; etc.)
*
 * However, the approach used by this OLAPCube implementation is the more general case, because it can easily simulate
 * fixed/named-level hierarchies whereas the reverse is not true. In the clothing example above, you would simply key
 * your dimension off of a derived field that was a combination of the SEX and TYPE columns (e.g. ['mens', 'pants']).
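 *
 * A sketch of that derived-field approach using `deriveFieldsOnInput` (the `sex`/`type`/`price`
 * field names are hypothetical, chosen just to mirror the clothing example):
 *
```javascript
// Combine two fixed-level columns into one materialized-path array so the
// generic hierarchy dimension can roll it up.
const deriveFieldsOnInput = [{
  field: 'sexAndType',
  f: (fact) => [fact.sex, fact.type]
}];

const fact = {sex: 'mens', type: 'pants', price: 30};
console.log(deriveFieldsOnInput[0].f(fact));
// [ 'mens', 'pants' ]
```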
*
* ## Date/Time hierarchies ##
*
 * Lumenize is designed to work well with the tzTime library. Here is an example of taking a bunch of ISO string data
 * and doing a timezone-precise hierarchical roll-up based upon the date segments (year, month).
*
* data = [
* {date: '2011-12-31T12:34:56.789Z', value: 10},
* {date: '2012-01-05T12:34:56.789Z', value: 20},
* {date: '2012-01-15T12:34:56.789Z', value: 30},
* {date: '2012-02-01T00:00:01.000Z', value: 40},
* {date: '2012-02-15T12:34:56.789Z', value: 50},
* ]
*
* {Time} = require('../')
*
* config =
* deriveFieldsOnInput: [{
* field: 'dateSegments',
* f: (row) ->
* return new Time(row.date, Time.MONTH, 'America/New_York').getSegmentsAsArray()
* }]
* metrics: [{field: 'value', f: 'sum'}]
* dimensions: [{field: 'dateSegments', type: 'hierarchy'}]
*
* cube = new OLAPCube(config, data)
* console.log(cube.toString(undefined, undefined, 'value_sum'))
 * # | dateSegments | value_sum |
 * # |==========================|
 * # | [2011]       |        10 |
 * # | [2011,12]    |        10 |
 * # | [2012]       |       140 |
 * # | [2012,1]     |        90 |
 * # | [2012,2]     |        50 |
*
* Notice how '2012-02-01T00:00:01.000Z' got bucketed in January because the calculation was done in timezone
* 'America/New_York'.
*
* ## Non-hierarchical Array fields ##
*
 * If you don't specify type: 'hierarchy' and the OLAPCube sees a field whose value is an Array in a dimension field, the
 * data in that fact will get aggregated against each element in the Array. So a non-hierarchical Array field like
 * ['x', 'y', 'z'] would get aggregated against 'x', 'y', and 'z' rather than ['x'], ['x', 'y'], and ['x', 'y', 'z']. This
* functionality is useful for accomplishing analytics on tags, but it can be used in other powerful ways. For instance
* let's say you have a list of events:
*
* events = [
* {name: 'Renaissance Festival', activeMonths: ['September', 'October']},
* {name: 'Concert Series', activeMonths: ['July', 'August', 'September']},
* {name: 'Fall Festival', activeMonths: ['September']}
* ]
*
* You could figure out the number of events active in each month by specifying "activeMonths" as a dimension.
* Lumenize.TimeInStateCalculator (and other calculators in Lumenize) use this technique.
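 *
 * What the cube computes for such a dimension can be sketched as a plain group-by over the array
 * elements (illustrative only; `countByTag` is not a Lumenize API):
 *
```javascript
// Each fact is counted once per element of its array-valued field, which is
// exactly how a tag dimension behaves.
function countByTag(facts, field) {
  const counts = {};
  for (const fact of facts) {
    for (const tag of fact[field]) {
      counts[tag] = (counts[tag] || 0) + 1;
    }
  }
  return counts;
}

const events = [
  {name: 'Renaissance Festival', activeMonths: ['September', 'October']},
  {name: 'Concert Series', activeMonths: ['July', 'August', 'September']},
  {name: 'Fall Festival', activeMonths: ['September']}
];

console.log(countByTag(events, 'activeMonths'));
// { September: 3, October: 1, July: 1, August: 1 }
```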
*
* @constructor
* @param {Object} config See Config options for details. DO NOT change the config settings after the OLAP class is instantiated.
 * @param {Object[]} [facts] Optional parameter allowing the population of the OLAPCube with an initial set of facts
* upon instantiation. Use addFacts() to add facts after instantiation.
* @cfg {Object[]} dimensions Array which specifies the fields to use as dimension fields. If the field contains a
* hierarchy array, say so in the row, (e.g. `{field: 'SomeFieldName', type: 'hierarchy'}`). Any array values that it
* finds in the supplied facts will be assumed to be tags rather than a hierarchy specification unless `type: 'hierarchy'`
* is specified.
*
* For example, let's say you have a set of facts that look like this:
*
* fact = {
* dimensionField: 'a',
* hierarchicalDimensionField: ['1','2','3'],
* tagDimensionField: ['x', 'y', 'z'],
* valueField: 10
* }
*
* Then a set of dimensions like this makes sense.
*
* config.dimensions = [
* {field: 'dimensionField'},
* {field: 'hierarchicalDimensionField', type: 'hierarchy'},
* {field: 'tagDimensionField', keepTotals: true}
* ]
*
* Notice how a keepTotals can be set for an individual dimension. This is preferable to setting it for the entire
* cube in cases where you don't want totals in all dimensions.
*
* If no dimension config is provided, then you must use syntactic sugar like groupBy.
*
* @cfg {String} [groupBy] Syntactic sugar for single-dimension/single-metric usage.
 * @cfg {String} [f] Syntactic sugar for single-dimension/single-metric usage. If provided, you must also provide
 * a `groupBy` config. If you provide a `groupBy` but no `f` or `field`, then the default `count` metric will be used.
 * @cfg {String} [field] Syntactic sugar for single-dimension/single-metric usage. If provided, you must also provide
 * a `groupBy` config. If you provide a `groupBy` but no `f` or `field`, then the default `count` metric will be used.
*
* @cfg {Object[]} [metrics=[]] Array which specifies the metrics to calculate for each cell in the cube.
*
* Example:
*
* config = {}
* config.metrics = [
* {field: 'field3'}, # defaults to metrics: ['sum']
* {field: 'field4', metrics: [
* {f: 'sum'}, # will add a metric named field4_sum
* {as: 'median4', f: 'p50'}, # renamed p50 to median4 from default of field4_p50
* {as: 'myCount', f: (values) -> return values.length} # user-supplied function
* ]}
* ]
*
* If you specify a field without any metrics, it will assume you want the sum but it will not automatically
* add the sum metric to fields with a metrics specification. User-supplied aggregation functions are also supported as
* shown in the 'myCount' metric above.
*
* Note, if the metric has dependencies (e.g. average depends upon count and sum) it will automatically add those to
* your metric definition. If you've already added a dependency but put it under a different "as", it's not smart
* enough to sense that and it will add it again. Either live with the slight inefficiency and duplication or leave
* dependency metrics named their default by not providing an "as" field.
*
* @cfg {Boolean} [keepTotals=false] Setting this will add an additional total row (indicated with field: null) along
* all dimensions. This setting can have an impact on the memory usage and performance of the OLAPCube so
 * if things are tight, only use it if you really need it. If you don't need it for all dimensions, you can specify
 * keepTotals for individual dimensions.
* @cfg {Boolean} [keepFacts=false] Setting this will cause the OLAPCube to keep track of the facts that contributed to
* the metrics for each cell by adding an automatic 'facts' metric. Note, facts are restored after deserialization
 * as you would expect, but they are no longer tied to the original facts. This feature, especially after a restore,
 * can eat up memory.
* @cfg {Object[]} [deriveFieldsOnInput] An Array of Maps in the form `{field:'myField', f:(fact)->...}`
* @cfg {Object[]} [deriveFieldsOnOutput] same format as deriveFieldsOnInput, except the callback is in the form `f(row)`
 * This is only called for dirty rows that were affected by the latest round of addFacts. It's more efficient to calculate things
* like standard deviation and percentile coverage here than in config.metrics. You just have to remember to include the dependencies
* in config.metrics. Standard deviation depends upon `sum` and `sumSquares`. Percentile coverage depends upon `values`.
* In fact, if you are going to capture values anyway, all of the functions are most efficiently calculated here.
* Maybe some day, I'll write the code to analyze your metrics and move them out to here if it improves efficiency.
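 *
 * As a sketch, a standard deviation derived on output from those kept dependencies might look like
 * this (the `_count`, `Points_sum`, and `Points_sumSquares` field names follow the default naming
 * convention described above, but treat them as assumptions):
 *
```javascript
// Population standard deviation from count, sum, and sumSquares.
// (Lumenize's own stdDev may use the sample formula; this shows the idea.)
function stdDevFromDependencies(count, sum, sumSquares) {
  const mean = sum / count;
  return Math.sqrt(sumSquares / count - mean * mean);
}

const deriveFieldsOnOutput = [{
  field: 'Points_stdDev',
  f: (row) => stdDevFromDependencies(row._count, row.Points_sum, row.Points_sumSquares)
}];

console.log(stdDevFromDependencies(4, 20, 120));  // sqrt(5) ≈ 2.236
```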
*/
/* <CoffeeScript>
constructor: (@userConfig, facts) ->
@config = utils.clone(@userConfig)
@cells = []
@cellIndex = {}
@currentValues = {}
# Syntactic sugar for groupBy
if @config.groupBy?
@config.dimensions = [{field: @config.groupBy}]
if @config.f? and @config.field?
@config.metrics = [{field: @config.field, f: @config.f}]
utils.assert(@config.dimensions?, 'Must provide config.dimensions.')
unless @config.metrics?
@config.metrics = []
@_dimensionValues = {} # key: fieldName, value: {} where key: uniqueValue, value: the real key (not stringified)
for d in @config.dimensions
@_dimensionValues[d.field] = {}
unless @config.keepTotals
@config.keepTotals = false
unless @config.keepFacts
@config.keepFacts = false
for d in @config.dimensions
if @config.keepTotals or d.keepTotals
d.keepTotals = true
else
d.keepTotals = false
functions.expandMetrics(@config.metrics, true, true)
# Set required fields
requiredFieldsObject = {}
for m in @config.metrics
if m.field?.length > 0 # Should only be false if function is count
requiredFieldsObject[m.field] = null
@requiredMetricsFields = (key for key, value of requiredFieldsObject)
requiredFieldsObject = {}
for d in @config.dimensions
requiredFieldsObject[d.field] = null
@requiredDimensionFields = (key for key, value of requiredFieldsObject)
@summaryMetrics = {}
@addFacts(facts)
@_possibilities: (key, type, keepTotals) ->
switch utils.type(key)
when 'array'
if keepTotals
a = [null]
else
a = []
if type == 'hierarchy'
len = key.length
while len > 0
a.push(key.slice(0, len))
len--
else # assume it's a tag array
if keepTotals
a = [null].concat(key)
else
a = key
return a
when 'string', 'number'
if keepTotals
return [null, key]
else
return [key]
@_decrement: (a, rollover) ->
i = a.length - 1
a[i]--
while a[i] < 0
a[i] = rollover[i]
i--
if i < 0
return false
else
a[i]--
return true
_expandFact: (fact) ->
possibilitiesArray = []
countdownArray = []
rolloverArray = []
for d in @config.dimensions
p = OLAPCube._possibilities(fact[d.field], d.type, d.keepTotals)
if p is undefined
console.log(fact)
possibilitiesArray.push(p)
countdownArray.push(p.length - 1)
rolloverArray.push(p.length - 1) # !TODO: If I need some speed, we could calculate the rolloverArray once and make a copy to the countdownArray for each run
for m in @config.metrics
@currentValues[m.field] = [fact[m.field]] # !TODO: Add default values here. I think this is the only place it is needed. write tests with incremental update to confirm.
out = []
more = true
while more
outRow = {}
for d, index in @config.dimensions
outRow[d.field] = possibilitiesArray[index][countdownArray[index]]
outRow._count = 1
if @config.keepFacts
outRow._facts = [fact]
for m in @config.metrics
outRow[m.as] = m.f([fact[m.field]], undefined, undefined, outRow, m.field + '_')
out.push(outRow)
more = OLAPCube._decrement(countdownArray, rolloverArray)
return out
@_extractFilter: (row, dimensions) ->
out = {}
for d in dimensions
out[d.field] = row[d.field]
return out
_mergeExpandedFactArray: (expandedFactArray) ->
for er in expandedFactArray
# set _dimensionValues
for d in @config.dimensions
fieldValue = er[d.field]
@_dimensionValues[d.field][JSON.stringify(fieldValue)] = fieldValue
# start merge
filterString = JSON.stringify(OLAPCube._extractFilter(er, @config.dimensions))
olapRow = @cellIndex[filterString]
if olapRow?
for m in @config.metrics
olapRow[m.as] = m.f(olapRow[m.field + '_values'], olapRow[m.as], @currentValues[m.field], olapRow, m.field + '_')
if olapRow._facts?
olapRow._facts = olapRow._facts.concat(er._facts)
else
olapRow = er
@cellIndex[filterString] = olapRow
@cells.push(olapRow)
@dirtyRows[filterString] = olapRow
addFacts: (facts) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-addFacts'> /**
</span> * @method addFacts
* @member Lumenize.OLAPCube
* Adds facts to the OLAPCube.
*
* @chainable
* @param {Object[]} facts An Array of facts to be aggregated into OLAPCube. Each fact is a Map where the keys are the field names
* and the values are the field values (e.g. `{field1: 'a', field2: 5}`).
*/
/* <CoffeeScript>
@dirtyRows = {}
if utils.type(facts) == 'array'
if facts.length <= 0
return
else
if facts?
facts = [facts]
else
return
if @config.deriveFieldsOnInput
for fact in facts
for d in @config.deriveFieldsOnInput
if d.as?
fieldName = d.as
else
fieldName = d.field
fact[fieldName] = d.f(fact)
for fact in facts
@addMissingFields(fact)
@currentValues = {}
expandedFactArray = @_expandFact(fact)
@_mergeExpandedFactArray(expandedFactArray)
# deriveFieldsOnOutput for @dirtyRows
if @config.deriveFieldsOnOutput?
for filterString, dirtyRow of @dirtyRows
for d in @config.deriveFieldsOnOutput
if d.as?
fieldName = d.as
else
fieldName = d.field
dirtyRow[fieldName] = d.f(dirtyRow)
@dirtyRows = {}
return this
addMissingFields: (fact) ->
for field in @requiredMetricsFields
if fact[field] is undefined
fact[field] = null
for field in @requiredDimensionFields
unless fact[field]?
fact[field] = '<missing>'
return fact
getCells: (filterObject) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-getCells'> /**
</span> * @method getCells
* @member Lumenize.OLAPCube
* Returns a subset of the cells that match the supplied filter. You can perform slice and dice operations using
* this. If you have criteria for all of the dimensions, you are better off using `getCell()`. Most times, it's
* better to iterate over the unique values for the dimensions of interest using `getCell()` in place of slice or
 * dice operations. However, there is a `slice()` method for extracting a 2D slice.
* @param {Object} [filterObject] Specifies the constraints that the returned cells must match in the form of
* `{field1: value1, field2: value2}`. If this parameter is missing, the internal cells array is returned.
* @return {Object[]} Returns the cells that match the supplied filter
*/
/* <CoffeeScript>
unless filterObject?
return @cells
output = []
for c in @cells
if utils.filterMatch(filterObject, c)
output.push(c)
return output
getCell: (filter, defaultValue) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-getCell'> /**
</span> * @method getCell
* @member Lumenize.OLAPCube
* Returns the single cell matching the supplied filter. Iterating over the unique values for the dimensions of
* interest, you can incrementally retrieve a slice or dice using this method. Since `getCell()` always uses an index,
* in most cases, this is better than using `getCells()` to prefetch a slice or dice.
* @param {Object} [filter={}] Specifies the constraints for the returned cell in the form of `{field1: value1, field2: value2}.
* Any fields that are specified in config.dimensions that are missing from the filter are automatically filled in
* with null. Calling `getCell()` with no parameter or `{}` will return the total of all dimensions (if @config.keepTotals=true).
 * @return {Object} Returns the cell that matches the supplied filter
*/
/* <CoffeeScript>
unless filter?
filter = {}
for key, value of filter
foundIt = false
for d in @config.dimensions
if d.field == key
foundIt = true
unless foundIt
throw new Error("#{key} is not a dimension for this cube.")
normalizedFilter = {}
for d in @config.dimensions
if filter.hasOwnProperty(d.field)
normalizedFilter[d.field] = filter[d.field]
else
if d.keepTotals
normalizedFilter[d.field] = null
else
throw new Error('Must set keepTotals to use getCell with a partial filter.')
cell = @cellIndex[JSON.stringify(normalizedFilter)]
if cell?
return cell
else
return defaultValue
getDimensionValues: (field, descending = false) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-getDimensionValues'> /**
</span> * @method getDimensionValues
* @member Lumenize.OLAPCube
* Returns the unique values for the specified dimension in sort order.
* @param {String} field The field whose values you want
* @param {Boolean} [descending=false] Set to true if you want them in reverse order
*/
/* <CoffeeScript>
values = utils.values(@_dimensionValues[field])
values.sort(utils.compare)
unless descending
values.reverse()
return values
@roundToSignificance: (value, significance) ->
unless significance?
return value
multiple = 1 / significance
return Math.floor(value * multiple) / multiple
toString: (rows, columns, metric, significance) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-toString'> /**
</span> * @method toString
* @member Lumenize.OLAPCube
 * Produces a printable table with the first dimension as the rows, the second dimension as the columns, and the
 * specified metric (`_count` by default) as the values in the table.
* @return {String} A string which will render as a table when written to the console.
* @param {String} [rows=<first dimension>]
* @param {String} [columns=<second dimension>]
 * @param {String} [metric='_count']
 * @param {Number} [significance] The multiple to which you want to round the values. 1 means whole numbers.
 * 0.1 means to round to tenths. 0.01 to hundredths. Etc.
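 *
 * In JavaScript terms, the rounding works like this (a direct translation of the
 * `roundToSignificance` helper in the source above):
 *
```javascript
// Round down to the nearest multiple of `significance`;
// a missing significance means no rounding at all.
function roundToSignificance(value, significance) {
  if (significance == null) return value;
  const multiple = 1 / significance;
  return Math.floor(value * multiple) / multiple;
}

console.log(roundToSignificance(3.14159, 0.01));  // 3.14
console.log(roundToSignificance(42));             // 42
```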
*/
/* <CoffeeScript>
unless metric?
metric = '_count'
if @config.dimensions.length == 1
return @toStringOneDimension(@config.dimensions[0].field, metric, significance)
else
return @toStringTwoDimensions(rows, columns, metric, significance)
toStringOneDimension: (field, metric, significance) ->
rowValues = @getDimensionValues(field)
rowValueStrings = (JSON.stringify(r) for r in rowValues)
rowLabelWidth = Math.max.apply({}, (s.length for s in rowValueStrings))
rowLabelWidth = Math.max(rowLabelWidth, 'Total'.length, field.length)
maxColumnWidth = metric.length
valueStrings = []
for r, indexRow in rowValues
filter = {}
filter[field] = r
cell = @getCell(filter)
if cell?
cellString = JSON.stringify(OLAPCube.roundToSignificance(cell[metric], significance))
else
cellString = ''
maxColumnWidth = Math.max(maxColumnWidth, cellString.length)
valueStrings.push(cellString)
maxColumnWidth += 1
fullWidth = rowLabelWidth + maxColumnWidth + 4
s = '| ' + (OLAPCube._padToWidth(field, rowLabelWidth, ' ', true)) + ' |'
s += OLAPCube._padToWidth(metric, maxColumnWidth) + ' |'
s += '\n|' + OLAPCube._padToWidth('', fullWidth, '=') + '|'
for r, indexRow in rowValueStrings
s += '\n| '
if r == 'null'
s += OLAPCube._padToWidth('Total', rowLabelWidth, ' ', true)
else
s += OLAPCube._padToWidth(r, rowLabelWidth, ' ', true)
s += ' |' + OLAPCube._padToWidth(valueStrings[indexRow], maxColumnWidth) + ' |'
if r == 'null'
s += '\n|' + OLAPCube._padToWidth('', fullWidth, '-') + '|'
return s
toStringTwoDimensions: (rows, columns, metric, significance) ->
unless rows?
rows = @config.dimensions[0].field
unless columns?
columns = @config.dimensions[1].field
rowValues = @getDimensionValues(rows)
columnValues = @getDimensionValues(columns)
rowValueStrings = (JSON.stringify(r) for r in rowValues)
columnValueStrings = (JSON.stringify(c) for c in columnValues)
rowLabelWidth = Math.max.apply({}, (s.length for s in rowValueStrings))
rowLabelWidth = Math.max(rowLabelWidth, 'Total'.length)
valueStrings = []
maxColumnWidth = Math.max.apply({}, (s.length for s in columnValueStrings))
maxColumnWidth = Math.max(maxColumnWidth, 'Total'.length)
for r, indexRow in rowValues
valueStringsRow = []
for c, indexColumn in columnValues
filter = {}
filter[rows] = r
filter[columns] = c
cell = @getCell(filter)
if cell?
cellString = JSON.stringify(OLAPCube.roundToSignificance(cell[metric], significance))
else
cellString = ''
maxColumnWidth = Math.max(maxColumnWidth, cellString.length)
valueStringsRow.push(cellString)
valueStrings.push(valueStringsRow)
maxColumnWidth += 1
s = '|' + (OLAPCube._padToWidth('', rowLabelWidth)) + ' ||'
for c, indexColumn in columnValueStrings
if c == 'null'
s += OLAPCube._padToWidth('Total', maxColumnWidth) + ' |'
else
s += OLAPCube._padToWidth(c, maxColumnWidth)
fullWidth = rowLabelWidth + maxColumnWidth * columnValueStrings.length + 3
if columnValueStrings[0] == 'null'
fullWidth += 2
s += '|\n|' + OLAPCube._padToWidth('', fullWidth, '=')
for r, indexRow in rowValueStrings
s += '|\n|'
if r == 'null'
s += OLAPCube._padToWidth('Total', rowLabelWidth, ' ', true)
else
s += OLAPCube._padToWidth(r, rowLabelWidth, ' ', true)
s += ' ||'
for c, indexColumn in columnValueStrings
s += OLAPCube._padToWidth(valueStrings[indexRow][indexColumn], maxColumnWidth)
if c == 'null'
s += ' |'
if r == 'null'
s += '|\n|' + OLAPCube._padToWidth('', fullWidth, '-')
s += '|'
return s
slice: (rows, columns, metric, significance) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-slice'> /**
</span> * @method slice
* @member Lumenize.OLAPCube
* Extracts a 2D slice of the data. It outputs an array of arrays (JavaScript two-dimensional array) organized as the
* C3 charting library would expect if submitting row-oriented data.
*
* Note, the calling parameters and output of this function are very similar to the 2D toString() function output
* except the data is organized as a two-dimensional array instead of newline-separated lines and the cells are
* filled with actual values instead of padded string representations of those values.
 * @return {[[]]} An array of arrays with one header row followed by one row per row-dimension value
* @param {String} [rows=<first dimension>]
* @param {String} [columns=<second dimension>]
 * @param {String} [metric='_count']
 * @param {Number} [significance] The multiple to which you want to round the values. 1 means whole numbers.
 * 0.1 means to round to tenths. 0.01 to hundredths. Etc.
*/
/* <CoffeeScript>
unless rows?
rows = @config.dimensions[0].field
unless columns?
columns = @config.dimensions[1].field
unless metric?
metric = '_count'
rowValues = @getDimensionValues(rows)
columnValues = @getDimensionValues(columns)
values = []
topRow = []
topRow.push(rows)
for c, indexColumn in columnValues
if c is null
topRow.push('Total')
else
topRow.push(c)
values.push(topRow)
for r, indexRow in rowValues
valuesRow = []
if r is null
valuesRow.push('Total')
else
valuesRow.push(r)
for c, indexColumn in columnValues
filter = {}
filter[rows] = r
filter[columns] = c
cell = @getCell(filter)
if cell?
cellValue = OLAPCube.roundToSignificance(cell[metric], significance)
else
cellValue = null
valuesRow.push(cellValue)
values.push(valuesRow)
return values
@_padToWidth: (s, width, padCharacter = ' ', rightPad = false) ->
if s.length > width
return s.substr(0, width)
padding = new Array(width - s.length + 1).join(padCharacter)
if rightPad
return s + padding
else
return padding + s
getStateForSaving: (meta) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-method-getStateForSaving'> /**
</span> * @method getStateForSaving
* @member Lumenize.OLAPCube
* Enables saving the state of an OLAPCube.
* @param {Object} [meta] An optional parameter that will be added to the serialized output and added to the meta field
* within the deserialized OLAPCube
 * @return {Object} Returns an Object representing the state of the OLAPCube. This Object is suitable for saving
 * to an object store. Use the static method `newFromSavedState()` with this Object as the parameter to reconstitute the OLAPCube.
*
* facts = [
* {ProjectHierarchy: [1, 2, 3], Priority: 1},
* {ProjectHierarchy: [1, 2, 4], Priority: 2},
* {ProjectHierarchy: [5] , Priority: 1},
* {ProjectHierarchy: [1, 2] , Priority: 1},
* ]
*
* dimensions = [
* {field: "ProjectHierarchy", type: 'hierarchy'},
* {field: "Priority"}
* ]
*
* config = {dimensions, metrics: []}
* config.keepTotals = true
*
* originalCube = new OLAPCube(config, facts)
*
* dateString = '2012-12-27T12:34:56.789Z'
* savedState = originalCube.getStateForSaving({upToDate: dateString})
* restoredCube = OLAPCube.newFromSavedState(savedState)
*
* newFacts = [
* {ProjectHierarchy: [5], Priority: 3},
* {ProjectHierarchy: [1, 2, 4], Priority: 1}
* ]
* originalCube.addFacts(newFacts)
* restoredCube.addFacts(newFacts)
*
* console.log(restoredCube.toString() == originalCube.toString())
* # true
*
* console.log(restoredCube.meta.upToDate)
* # 2012-12-27T12:34:56.789Z
*/
/* <CoffeeScript>
out =
config: @userConfig
# cellsAsCSVStyleArray: arrayOfMaps_To_CSVStyleArray(@cells)
cells: @cells
summaryMetrics: @summaryMetrics
if meta?
out.meta = meta
return out
@newFromSavedState: (p) ->
</CoffeeScript> */
<span id='Lumenize-OLAPCube-static-method-newFromSavedState'> /**
</span> * @method newFromSavedState
* @member Lumenize.OLAPCube
* Deserializes a previously stringified OLAPCube and returns a new OLAPCube.
*
* See `getStateForSaving()` documentation for a detailed example.
*
* Note, if you have specified config.keepFacts = true, the values for the facts will be restored, however, they
* will no longer be references to the original facts. For this reason, it's usually better to include a `values` or
* `uniqueValues` metric on some ID field if you want fact drill-down support to survive a save and restore.
* @static
* @param {String/Object} p A String or Object from a previously saved OLAPCube state
* @return {OLAPCube}
*/
/* <CoffeeScript>
if utils.type(p) is 'string'
p = JSON.parse(p)
cube = new OLAPCube(p.config)
cube.summaryMetrics = p.summaryMetrics
if p.meta?
cube.meta = p.meta
if p.cellsAsCSVStyleArray?
cube.cells = csvStyleArray_To_ArrayOfMaps(p.cellsAsCSVStyleArray)
else
cube.cells = p.cells
cube.cellIndex = {}
cube._dimensionValues = {}
for d in cube.config.dimensions
cube._dimensionValues[d.field] = {}
for c in cube.cells
filterString = JSON.stringify(OLAPCube._extractFilter(c, cube.config.dimensions))
# rebuild cellIndex
cube.cellIndex[filterString] = c
# rebuild _dimensionValues
for d in cube.config.dimensions
fieldValue = c[d.field]
cube._dimensionValues[d.field][JSON.stringify(fieldValue)] = fieldValue
return cube
exports.OLAPCube = OLAPCube
</CoffeeScript> */</pre>
</body>
</html>