textract

Extracts text from files of various types, including html, pdf, doc, docx, xls, xlsx, csv, pptx, png, jpg, gif, rtf, text/*, and various OpenOffice formats.

# Setting up tesseract on mac

```
sudo port install tesseract
sudo port install tesseract-chi-sim
echo "export TESSDATA_PREFIX='/opt/local/share/tessdata/'" >> ~/.bash_profile
```

# antiword on mac

`brew install antiword`

# unrtf

`brew install unrtf`
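
After running the installs, it can help to confirm that each tool is actually on the `PATH`. The following is a minimal sketch, not part of textract itself: `have_tool` is a hypothetical helper name introduced here for illustration, and the check only assumes a POSIX-style shell such as bash.

```shell
# Hypothetical helper (not part of textract): succeeds if the command is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether each tool from the steps above can be found.
for tool in tesseract antiword unrtf; do
  if have_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done

# The variable is only visible in shells started after ~/.bash_profile was edited.
echo "TESSDATA_PREFIX=${TESSDATA_PREFIX:-<unset>}"
```

If `TESSDATA_PREFIX` shows as `<unset>`, opening a new terminal or running `source ~/.bash_profile` should pick up the exported value.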