Similarity Search for Content Matching | TextWise LLC

Aug 31, 2011 - textwise.com - The TextWise SemanticHacker API provides a match service call that analyzes the text or Web page provided in the call and returns a Semantic Signature and a match. Matching documents via their

ocropus - The OCRopus(tm) open source document analysis and OCR system - Google Project Hosting

Apr 19, 2011 - code.google.com - OCRopus(tm) is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual

linux - extracting text from MS word files in python - Stack Overflow

Apr 19, 2011 - stackoverflow.com - For working with MS Word files in Python, there are the Python win32 extensions, which can be used on Windows. How do I do the same on Linux? Is there any library?

Antiword: a free MS Word document reader

Apr 19, 2011 - winfield.demon.nl - Among the platforms happily ignored by Microsoft is, naturally, RISC OS, the platform that goes with the computers that were made by Acorn Computers Ltd. of Cambridge in the UK. Today the platform is

Apache POI - the Java API for Microsoft Documents

Apr 19, 2011 - poi.apache.org - The Apache POI team is pleased to announce the release of 3.8 beta 2. This includes a large number of bug fixes and enhancements. A full list of changes is available in the change log.

PDFMiner

Apr 19, 2011 - unixuser.org - Python PDF parser and analyzer. Last modified: Sun Feb 27 10:51:18 UTC 2011.