@juspay/neurolink

Universal AI Development Platform with working MCP integration, multi-provider support, voice (TTS/STT/realtime), and professional CLI. 58+ external MCP servers discoverable, multimodal file processing, RAG pipelines. Build, test, and deploy AI applicatio

/**
 * PowerPoint (PPTX) Processing Utility
 *
 * Extracts text content from PowerPoint (.pptx) files by treating them
 * as ZIP archives and parsing the slide XML files within.
 *
 * PPTX files are ZIP archives containing:
 * - ppt/slides/slide1.xml, slide2.xml, ... — slide content
 * - ppt/slideMasters/ — master slide templates
 * - ppt/slideLayouts/ — slide layout definitions
 *
 * Text is extracted from `<a:t>` elements in the slide XML files.
 * Slides are sorted by number and presented in reading order.
 *
 * Uses `adm-zip` (already a project dependency) for ZIP extraction.
 *
 * @module processors/document/PptxProcessor
 *
 * @example
 * ```typescript
 * import { PptxProcessor } from "./PptxProcessor.js";
 *
 * const text = await PptxProcessor.extractText(buffer);
 * if (text) {
 *   console.log("Extracted text:", text);
 * }
 * ```
 */
/**
 * Static utility class for extracting text from PPTX files.
 *
 * Designed as a static class (not extending BaseFileProcessor) because
 * PPTX processing is straightforward ZIP+XML extraction and does not
 * need the full download/validate/process pipeline of BaseFileProcessor.
 */
export declare class PptxProcessor {
    /**
     * Extract all text content from a PPTX buffer.
     *
     * @param content - Raw PPTX file buffer
     * @returns Formatted text content with slide headers, or null if no text found
     * @throws Error if the buffer is not a valid ZIP/PPTX file
     */
    static extractText(content: Buffer): Promise<string | null>;
    /**
     * Extract text strings from a slide XML document.
     * Finds all `<a:t>` elements and returns their text content.
     *
     * @param xml - Raw XML string from a slide file
     * @returns Array of text strings found in the slide
     */
    private static extractTextFromXml;
    /**
     * Extract text from specific slides in a PPTX file.
     *
     * Called by the `extract_file_content` tool for targeted slide access.
     *
     * @param content - Raw PPTX file buffer
     * @param slideNumbers - Array of 1-indexed slide numbers to extract
     * @returns Formatted text from the requested slides
     */
    static extractSlides(content: Buffer, slideNumbers: number[]): Promise<string>;
}
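
The declarations above describe two steps that are not visible in a `.d.ts` file: pulling text out of `<a:t>` elements and ordering slides by number. The following is a minimal sketch of how those steps might look, not the package's actual implementation. The helper names (`decodeXmlEntities`, `extractTextRuns`, `slideNumber`, `sortSlideEntries`) are hypothetical, and the sketch works on plain strings, so it assumes the ZIP entries have already been read (for example with `adm-zip`).

```typescript
// Hypothetical sketch: these helpers are illustrative and are NOT taken
// from the @juspay/neurolink source.

/** Decode the five predefined XML entities that can appear in text nodes. */
function decodeXmlEntities(text: string): string {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // decoded last so "&amp;lt;" yields "&lt;", not "<"
}

/** Collect the text of every <a:t> element in a slide XML string. */
function extractTextRuns(xml: string): string[] {
  const runs: string[] = [];
  // Matches <a:t> and <a:t attr="...">, but not <a:tab> or <a:tbl>.
  const pattern = /<a:t(?:\s[^>]*)?>([\s\S]*?)<\/a:t>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    const text = decodeXmlEntities(match[1]);
    // Skip runs that contain only whitespace.
    if (text.trim().length > 0) {
      runs.push(text);
    }
  }
  return runs;
}

/** Return the 1-indexed slide number for a ZIP entry name, or null. */
function slideNumber(entryName: string): number | null {
  const m = /^ppt\/slides\/slide(\d+)\.xml$/.exec(entryName);
  return m ? Number(m[1]) : null;
}

/** Keep only slide entries and sort them numerically (slide2 before slide10). */
function sortSlideEntries(entryNames: string[]): string[] {
  return entryNames
    .filter((name) => slideNumber(name) !== null)
    .sort((a, b) => (slideNumber(a) as number) - (slideNumber(b) as number));
}

// Usage with made-up sample data.
const sampleXml =
  "<p:sld><a:p><a:r><a:t>Q3 &amp; Q4 results</a:t></a:r>" +
  '<a:r><a:t xml:space="preserve"> up 12%</a:t></a:r></a:p></p:sld>';
console.log(extractTextRuns(sampleXml));

console.log(
  sortSlideEntries([
    "ppt/slides/slide10.xml",
    "ppt/slideLayouts/slideLayout1.xml",
    "ppt/slides/slide2.xml",
    "ppt/slides/slide1.xml",
  ]),
);
```

The numeric comparison matters because a plain string sort would place `slide10.xml` before `slide2.xml`. A regular expression is used here only to keep the sketch dependency-free; a real XML parser would handle edge cases such as CDATA sections and numeric character references more robustly.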