Hello, dear community!
Exciting improvements have continued to land since 3.7. Here is a synopsis of the highlights from the version 3.8 release.
Translated Redaction Labels
Private AI supports text processing in multiple languages, and redaction markers are now also available in multiple languages. See the Supported Language documentation for more information on which languages support translated labels.
WebSocket Context
Private AI version 3.8 now maintains context between WebSocket messages. Similar to `link_batch`, text inputs are processed together as a single input. This helps deliver more consistent redaction markers across a series of interactions, like an SMS chat conversation.
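To make the benefit concrete, here is a toy sketch of why shared context gives more consistent markers. It is not the Private AI API: the `MarkerContext` class, the fixed name list standing in for a PII model, and the `[NAME_1]` marker format are all assumptions made purely for illustration.

```python
import re
from collections import defaultdict


class MarkerContext:
    """Toy stand-in for shared context: remembers which marker each value received."""

    def __init__(self):
        self._markers = {}               # (label, value) -> marker string
        self._counts = defaultdict(int)  # label -> number of distinct values seen

    def marker_for(self, label, value):
        key = (label, value.lower())
        if key not in self._markers:
            self._counts[label] += 1
            self._markers[key] = f"[{label}_{self._counts[label]}]"
        return self._markers[key]


# Hypothetical detector: a fixed name list stands in for a real PII model.
KNOWN_NAMES = ["Alice", "Bob"]
NAME_PATTERN = re.compile(r"\b(" + "|".join(KNOWN_NAMES) + r")\b")


def redact(message, context):
    """Replace every known name with the marker the context assigns to it."""
    return NAME_PATTERN.sub(lambda m: context.marker_for("NAME", m.group(0)), message)


def redact_conversation(messages, shared_context=True):
    """Redact messages in order, optionally resetting the context for each one."""
    context = MarkerContext()
    redacted = []
    for message in messages:
        if not shared_context:
            context = MarkerContext()  # each message is treated in isolation
        redacted.append(redact(message, context))
    return redacted


chat = ["Hi, this is Bob.", "Alice here. Bob, are you free at noon?"]
with_context = redact_conversation(chat, shared_context=True)
without_context = redact_conversation(chat, shared_context=False)

for line in with_context:
    print(line)
```

With a shared context, Bob keeps the same marker in both messages. When each message is processed in isolation, the numbering restarts, so the same person can end up with a different marker from one message to the next.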
Black Box Image Redaction
Private AI offers the ability to perform black-box redaction on images. For more information, please visit the image processing documentation.
Examples, guides and more
We continue to improve our docs site. We’ve introduced a new Getting Started guide, additional documentation around API fundamentals, and additional guides.
New Language Support
Extended support for the Georgian language has been introduced, expanding the model’s linguistic capabilities to 53 languages.
New Entity Type
A new entity, `LOCATION_ADDRESS_STREET`, has been added. This entity captures street names and numbers, including unit numbers, providing more granularity than the existing `LOCATION_ADDRESS` entity, which captures complete addresses.
New RAM Requirements
The container now requires 64 GB of RAM to ensure reliable and performant operation when processing files. A warning message will be presented if the system does not have sufficient memory.
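If you want to check a host before deploying, a pre-flight script along these lines may help. This is a minimal sketch and not part of the container: the function names are our own, it relies on `os.sysconf`, which is available on Linux and most Unix-like systems, and it assumes "64 GB" means 64 GiB.

```python
import os

REQUIRED_RAM_GB = 64  # figure taken from the 3.8 release note


def total_ram_bytes():
    """Return total physical memory in bytes, or None if sysconf cannot report it."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. Windows) or the names are unsupported.
        return None


def has_enough_ram(total_bytes, required_gb=REQUIRED_RAM_GB):
    """True when the reported memory meets the requirement (GB treated as GiB)."""
    return total_bytes is not None and total_bytes >= required_gb * 1024**3


total = total_ram_bytes()
if total is None:
    print("Could not determine total RAM on this platform.")
elif has_enough_ram(total):
    print(f"OK: {total / 1024**3:.1f} GiB of RAM detected.")
else:
    print(f"Warning: {total / 1024**3:.1f} GiB detected, {REQUIRED_RAM_GB} GB required.")
```

Hosts sold as 64 GB often report slightly less than 64 GiB because some memory is reserved, so you may want to pass a slightly lower `required_gb` rather than treat the threshold as exact.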
Model Improvements
These updates signify a substantial step forward in the model’s functionality and security, broadening its applicability and accuracy across various languages and data types.
Model updates since 3.7 include:
- Enhanced detection of numerical entities in multiple languages (French, Spanish, English, Japanese, Portuguese, and Dutch), including specific improvements for social security numbers (SSN) and credit card numbers.
- Improved identification of partial credit card numbers and SSNs, especially when spoken or transcribed, across several languages (Spanish, Dutch, Korean, German, Italian).
- Better detection of `BANK_ACCOUNT`, `MEDICAL_PROCESS`, `TIME`, and `MONEY` entities in various languages, with specific enhancements for PII detection in Mandarin (simplified script) and Spanish, focusing on regional equivalents of identifiers.
- General improvements in detecting numerical entities written as words, benefiting multilingual text processing.
Cheers,
The Private AI Product Team