@eagleoutice/flowr

Static Dataflow Analyzer and Program Slicer for the R Programming Language

```ts
import type { PipelineStepName } from './steps/pipeline-step';
import { PipelineStepStage } from './steps/pipeline-step';
import type { Pipeline, PipelineInput, PipelineOutput, PipelinePerRequestInput, PipelineStepOutputWithName } from './steps/pipeline/pipeline';
/**
 * The pipeline executor allows you to execute arbitrary {@link Pipeline|pipelines} in a step-by-step fashion.
 * If you are not yet in possession of a {@link Pipeline|pipeline}, you can use the {@link createPipeline} function
 * to create one for yourself, based on the steps that you want to execute.
 *
 * Those steps are split into two phases or "stages" (the term we will use in the following), represented
 * by the {@link PipelineStepStage} type. These allow us to separate things that have to be done
 * once per file, e.g., actually parsing the AST, from those that we need to repeat once per request (whatever this
 * request may be). In other words, they distinguish what can be cached between operations from what cannot.
 *
 * Furthermore, this executor follows an iterable fashion to be *as flexible as possible*
 * (e.g., to be instrumented with measurements). So, you can use the pipeline executor like this:
 *
 * ```ts
 * const stepper = new PipelineExecutor( ... )
 * while(stepper.hasNextStep()) {
 *     await stepper.nextStep()
 * }
 *
 * stepper.switchToRequestStage()
 *
 * while(stepper.hasNextStep()) {
 *     await stepper.nextStep()
 * }
 *
 * const result = stepper.getResults()
 * ```
 *
 * Of course, you might think that this is rather overkill if you simply want to receive the result.
 * And this is true. Therefore, if you do not want to perform some kind of magic in-between steps, you can use the
 * **{@link allRemainingSteps}** function like this:
 *
 * ```ts
 * const stepper = new PipelineExecutor( ... )
 * const result = await stepper.allRemainingSteps()
 * ```
 *
 * As the name suggests, you can combine this function with previous calls to {@link nextStep} to only execute the
 * remaining steps, in case, for whatever reason, you only want to instrument some steps.
 *
 * By default, the {@link PipelineExecutor} does not offer an automatic way to repeat requests (mostly to prevent accidental errors).
 * However, you can use the **{@link updateRequest}** function to reset the request steps and re-execute them for a new request.
 * This allows something like the following:
 *
 * ```ts
 * const stepper = new PipelineExecutor( ... )
 * const result = await stepper.allRemainingSteps()
 *
 * stepper.updateRequest( ... )
 * const result2 = await stepper.allRemainingSteps()
 * ```
 *
 * **Example - Slicing With the Pipeline Executor**:
 *
 * Suppose you want to, you know, _slice_ a file (which was, at one point, the origin of flowR). Then you can
 * either create a pipeline yourself with the respective steps, or you can use the {@link DEFAULT_SLICING_PIPELINE} (and friends).
 * With it, slicing essentially becomes 'easy as pie':
 *
 * ```ts
 * const slicer = new PipelineExecutor(DEFAULT_SLICING_PIPELINE, {
 *     parser:    new RShell(),
 *     // of course, the criterion and request given here are just examples; you can use whatever you want to slice!
 *     criterion: ['2@b'],
 *     request:   requestFromInput('b <- 3; x <- 5\ncat(b)'),
 * })
 * const result = await slicer.allRemainingSteps()
 * ```
 *
 * But now we want to slice for `x` in the first line as well! We can do that by adding:
 *
 * ```ts
 * slicer.updateRequest({ criterion: ['1@x'] })
 * const result2 = await slicer.allRemainingSteps()
 * ```
 *
 * @note Even though using the pipeline executor introduces a small performance overhead, we consider
 * it to be the baseline for performance benchmarking. It may very well be possible to squeeze out a little bit more by
 * directly constructing the steps in the right order. However, we consider this to be negligible when compared with the time required
 * for, for example, the dataflow analysis of larger files.
 *
 * @see PipelineExecutor#allRemainingSteps
 * @see PipelineExecutor#nextStep
 */
export declare class PipelineExecutor<P extends Pipeline> {
    private readonly pipeline;
    private readonly length;
    private input;
    private output;
    private currentExecutionStage;
    private stepCounter;
    /**
     * Construct a new pipeline executor.
     * The required additional input is specified by the {@link IPipelineStep#requiredInput|required input configuration} of each step in the `pipeline`.
     *
     * Please see {@link createDataflowPipeline} and friends for engine-agnostic shortcuts to create a pipeline executor.
     *
     * @param pipeline - The {@link Pipeline} to execute, probably created with {@link createPipeline}.
     * @param input    - External {@link PipelineInput|configuration and input} required to execute the given pipeline.
     */
    constructor(pipeline: P, input: PipelineInput<P>);
    /**
     * Retrieve the {@link Pipeline|pipeline} that is currently being executed.
     */
    getPipeline(): P;
    /**
     * Retrieve the current {@link PipelineStepStage|stage} the pipeline executor is in.
     *
     * @see currentExecutionStage
     * @see switchToRequestStage
     * @see PipelineStepStage
     */
    getCurrentStage(): PipelineStepStage;
    /**
     * Switch to the next {@link PipelineStepStage|stage} of the pipeline executor.
     *
     * This will fail if either a stage change is currently not valid (as not all steps have been executed),
     * or if there is no next stage (i.e., the pipeline is already completed or in the last stage).
     *
     * @see PipelineExecutor
     * @see getCurrentStage
     */
    switchToRequestStage(): void;
    getResults(intermediate?: false): PipelineOutput<P>;
    getResults(intermediate: true): Partial<PipelineOutput<P>>;
    getResults(intermediate: boolean): PipelineOutput<P>;
    /**
     * Returns true only if
     * 1) there are more {@link IPipelineStep|steps} to do for the current {@link PipelineStepStage|stage} and
     * 2) we have not yet reached the end of the {@link Pipeline|pipeline}.
     */
    hasNextStep(): boolean;
    /**
     * Execute the next {@link IPipelineStep|step} and return the name of the {@link IPipelineStep|step} that was executed,
     * so you can guard if the {@link IPipelineStep|step} differs from what you are interested in.
     * Furthermore, it returns the {@link IPipelineStep|step's} result.
     *
     * @param expectedStepName - A safeguard if you want to retrieve the result.
     *        If given, it causes the execution to fail if the next step is not the one you expect.
     *
     * _Without `expectedStepName`, please refrain from accessing the result, as you have no safeguards if the pipeline changes._
     */
    nextStep<PassedName extends PipelineStepName>(expectedStepName?: PassedName): Promise<{
        name: typeof expectedStepName extends undefined ? PipelineStepName : PassedName;
        result: typeof expectedStepName extends undefined ? unknown : PipelineStepOutputWithName<P, PassedName>;
    }>;
    private _doNextStep;
    /**
     * This only makes sense if you have already run a request and want to re-use the per-file results for a new one
     * (or if, for whatever reason, you did not pass the information for the pipeline with the constructor).
     *
     * @param newRequestData - Data for the new request
     */
    updateRequest(newRequestData: PipelinePerRequestInput<P>): void;
    allRemainingSteps(canSwitchStage: false): Promise<Partial<PipelineOutput<P>>>;
    allRemainingSteps(canSwitchStage?: true): Promise<PipelineOutput<P>>;
    allRemainingSteps(canSwitchStage: boolean): Promise<PipelineOutput<P>>;
}
```
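The JSDoc above describes a two-stage lifecycle: per-file steps run once and stay cached, while per-request steps can be rewound and re-run via `updateRequest`. The following self-contained sketch mimics that contract in plain TypeScript. Note that `MockExecutor`, `MockStep`, and the step names are invented for illustration; this is not flowR's actual implementation, just a minimal model of the iteration semantics.

```typescript
// A hypothetical mock of the executor's two-stage lifecycle (NOT flowR code):
// per-file steps run once; updateRequest() rewinds only the per-request steps.
type Stage = 'once-per-file' | 'once-per-request';

interface MockStep {
    name: string;
    stage: Stage;
    run: (request: string) => string;
}

class MockExecutor {
    private stage: Stage = 'once-per-file';
    private cursor = 0;
    private readonly results = new Map<string, string>();

    // steps must list all 'once-per-file' entries before 'once-per-request' ones
    constructor(private readonly steps: MockStep[], private request: string) {}

    hasNextStep(): boolean {
        return this.cursor < this.steps.length
            && this.steps[this.cursor].stage === this.stage;
    }

    nextStep(): string {
        const step = this.steps[this.cursor++];
        this.results.set(step.name, step.run(this.request));
        return step.name;
    }

    switchToRequestStage(): void {
        this.stage = 'once-per-request';
    }

    updateRequest(request: string): void {
        // rewind only the per-request steps; cached per-file results are kept
        this.request = request;
        this.cursor = this.steps.findIndex(s => s.stage === 'once-per-request');
    }

    allRemainingSteps(): Map<string, string> {
        while (this.hasNextStep()) this.nextStep();
        if (this.stage === 'once-per-file') {
            this.switchToRequestStage();
            while (this.hasNextStep()) this.nextStep();
        }
        return this.results;
    }
}

const executor = new MockExecutor([
    { name: 'parse', stage: 'once-per-file',    run: r => `ast(${r})` },
    { name: 'slice', stage: 'once-per-request', run: r => `slice(${r})` },
], 'x <- 1');

const first = executor.allRemainingSteps();
console.log(first.get('slice'));   // slice(x <- 1)

executor.updateRequest('y <- 2');
const second = executor.allRemainingSteps();
console.log(second.get('slice'));  // slice(y <- 2)
console.log(second.get('parse'));  // ast(x <- 1) -- the per-file result was not re-run
```

The design mirrors the documented contract: `hasNextStep()` is stage-local, so the caller (or `allRemainingSteps()`) must explicitly cross the stage boundary with `switchToRequestStage()`, and `updateRequest()` never invalidates per-file results.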