Blogs

Looking Ahead to 2013 - 19 December, 2012

What the New Year offers and how Pingar expects to deliver.


Pingar at ShareFEST 2013 - 9 April, 2013

“Extracting and Mapping SharePoint Content to Create a Custom Taxonomy”


Processing 100+ GB of data using Pingar API on Amazon - 9 July, 2012

Gene Golovchinsky, a Senior Research Scientist at FujiXerox, approached us with an interesting question: can the Pingar API scale to process a rather large dataset of approximately 1.7 million documents retrieved from CiteSeer (a repository of research publications)? As scientific publications, these documents typically average 6 pages of text in a small font, so after text extraction the dataset adds up to over 110 GB of uncompressed text. This post examines the methodology we used and the results of our experiments.

 
