Parsing - Technical Specifications and Sovren FAQ

Overview

This article answers frequently asked questions about Invenias parsing. Invenias parsing capabilities are powered by the latest version of Sovren, one of the world’s leading CV/Resume parsing engines. We have made some choices around our specific implementation of Sovren and how we handle the data it provides, but you will find Sovren's full technical specifications here, and below we have summarised responses to the most frequently asked questions.

FAQ

What document formats can Sovren handle?

Sovren supports essentially any non-image resume and CV format, including the formats used by popular job boards and by social and professional networks. This includes:

  • Microsoft Word (all versions including DOCX)
  • PDF*
  • HTML
  • Rich Text (RTF)
  • OpenOffice 2.+
  • Microsoft Office HTML
  • Text Excel

* If you are seeing poor parsing results from a PDF, the problem is almost certainly that the PDF looks fine on screen but is internally corrupted. Here is Sovren's article on the topic, but there is a simple way to find out whether a PDF is corrupt. Open the file using the free Adobe Acrobat Reader software, choose File -> Save as Text and save the file. Then open the saved file using a text editor such as Notepad or UltraEdit. If the text you see there is garbled, out of order or missing, the PDF itself is the likely cause of the poor results.
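The manual check above can also be roughly automated. The following Python sketch is only an illustration and is not part of Invenias or Sovren: the function names and the 0.3 threshold are assumptions chosen for the example. It takes the text saved from Adobe Acrobat Reader and measures how much of it consists of characters that are neither letters, digits nor common punctuation.

```python
import string


def garbled_ratio(text: str) -> float:
    """Fraction of visible characters that are not letters, digits or ASCII punctuation."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        # No extractable text at all, which is what an image-only PDF produces.
        return 1.0
    odd = sum(
        1 for ch in visible
        if not (ch.isalnum() or ch in string.punctuation)
    )
    return odd / len(visible)


def looks_corrupt(text: str, threshold: float = 0.3) -> bool:
    """Heuristic only: the threshold is an assumed value, not a Sovren setting."""
    return garbled_ratio(text) > threshold
```

To use it, read the file produced by File -> Save as Text and pass its contents to looks_corrupt. Bullet characters and typographic quotes count as unusual characters in this sketch, so a clean CV can still score slightly above zero. Treat the result as a hint and confirm by reading the text yourself.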

Does Sovren support parsing in languages other than English?

The Sovren parser detects the language automatically, so you never have to tell it what language a CV is written in. Sovren currently supports parsing in the following languages:

  • Bulgarian
  • Chinese (Simplified)
  • Croatian
  • Czech
  • Danish
  • Dutch
  • English (all dialects)
  • Estonian
  • Finnish
  • French (all dialects, including Canada, Belgium, Italy, Liechtenstein, and others)
  • German (all dialects including Switzerland, Liechtenstein and Austria)
  • Greek
  • Hungarian
  • Italian (all dialects)
  • Latvian
  • Lithuanian
  • Norwegian
  • Polish
  • Portuguese (all dialects)
  • Romanian
  • Russian (including Belarusian)
  • Slovak
  • Slovenian
  • Spanish (also Catalan, Galician, Basque)
  • Swedish

In addition, Sovren supports automatic "Locale Detection", which identifies the region or country of a CV and applies region-specific logic when parsing addresses, telephone numbers and similar fields.

Does the Sovren resume parser service store my resumes?

No, the Sovren resume parsing SaaS does not store any CVs/Resumes. All parsing is done in-memory so there is never any data written to a file system or database.

Is the Sovren Service secure and scalable?

Invenias integrates with Sovren via a REST API. Sovren's service uses SSL to handle encrypted links between services and does not store your data. This service runs on Amazon Web Services (AWS), scales both vertically and horizontally and has no document limits.
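As an illustration of what sending a document to a REST parsing service over an encrypted connection can look like, here is a minimal Python sketch. It does not show Sovren's actual API or the Invenias integration: the URL, the JSON field name document_base64 and the function name are hypothetical placeholders. The sketch only builds the request and does not send it.

```python
import base64
import json
import urllib.request

# Hypothetical placeholder URL, not Sovren's real endpoint.
PARSER_URL = "https://parser.example.invalid/parse"


def build_parse_request(document: bytes, url: str = PARSER_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a CV as base64 JSON."""
    if not url.lower().startswith("https://"):
        # Refuse plain HTTP so the CV is only ever sent over an encrypted link.
        raise ValueError("parser URL must use https")
    # "document_base64" is an assumed field name used only for this example.
    body = json.dumps(
        {"document_base64": base64.b64encode(document).decode("ascii")}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The document is read into memory and encoded directly into the request body, so nothing needs to be written to disk on the sending side either.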