Document Types
Creating a Document Type in ContraxSuite allows an administrator to choose and configure a specific set of data fields that they want to work with.
ContraxSuite automatically finds sentences or sections within a specific Document Type and then extracts an appropriate value for any given Document Field assigned to that Document Type. Through the creation of unique Document Types, ContraxSuite answers the exact questions that users want answered about any document assigned to a project. (Each project in ContraxSuite has only one Document Type.)
An administrator can create a new Document Type in order to find specific Document Fields for their project. A Document Field can be any data type, from a simple calendar date to complex clauses that require machine learning and model building. Each Document Field has a Field Type, which guides the system when searching for the right value. (Examples of Field Types include: Address, Amount, Choice, Company, Date, Duration, Percent, Geography, etc.)
Next, write Document Field Detectors to extract text for each Field. Field Detectors direct the system toward the sentence, paragraph, or section of the document in which the value being sought is located.
Document Field Detectors find the correct values for each Document Field via the following techniques:
Defined words, terms, and phrases that LexNLP - the legal-specific dictionary - can identify based on format and context. Examples of these terms or words include words in quotations, words in parentheticals, and/or words that are near grammatical markers such as “means”.
Field Types such as percents, durations, currencies, and geographies all follow recognizable patterns. ContraxSuite uses regular expressions to detect these sequences of symbols and characters.
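As a rough illustration of the pattern-based technique, the sketch below uses Python's standard `re` module to pick out percents, simple durations, and quoted terms that precede the word "means". The patterns and the `find_candidates` function are assumptions made for this example; they are not the expressions that ContraxSuite or LexNLP actually use.

```python
import re

# Hypothetical patterns for illustration only; real detectors cover far more cases.
PERCENT_RE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:%|percent\b)", re.IGNORECASE)
DURATION_RE = re.compile(r"\b\d+\s+(?:day|week|month|year)s?\b", re.IGNORECASE)
# A quoted term followed by "means", e.g. "Premises" means ...
DEFINED_TERM_RE = re.compile(r'["“]([^"”]+)["”]\s+means\b')


def find_candidates(sentence: str) -> dict:
    """Return the raw text matched by each illustrative pattern."""
    return {
        "percent": [m.group(0) for m in PERCENT_RE.finditer(sentence)],
        "duration": [m.group(0) for m in DURATION_RE.finditer(sentence)],
        "defined_term": DEFINED_TERM_RE.findall(sentence),
    }


if __name__ == "__main__":
    print(find_candidates('"Term" means 30 days, with a late fee of 5% per month.'))
```

A Field Detector in ContraxSuite would then decide which of these candidate matches belongs to which Document Field; that step is not modeled here.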
How To Create a Document Type
To begin a custom Contract Analysis project, you must first create a Document Type. Once created, a user can add corresponding Document Fields and Field Detectors for the project’s different data types. Follow the steps below to create a new Document Type in ContraxSuite.
1. Go to Management in the Main Menu and select Document Types.
2. Click on Add Document Type in the upper right-hand corner.
3. A form will appear that has three fields to define and name your new Document Type.
Title: This is how the name of the Document Type will be displayed in ContraxSuite projects. This Title is what reviewers and other frontend users will see. This can be changed later.
Code: Enter a short reference code for the Document Type. This Code will be utilized in the system backend, and in the system admin interface. The purpose of unique Document Type Codes is to allow users to change a Title later, without affecting how the Fields in the Document Type function. The Document Type Code must meet the following criteria:
No uppercase letters
50 characters or less
Contains only Latin letters, digits, and underscores. You cannot have spaces in Codes. Use an underscore “_” instead of spaces to separate words in a Code (e.g., “document_type_code_one”)
Editor Type: The Editor Type dictates how the Fields in a Document Type are saved. This can be changed later.
The three choices for Editor Type are:
save_by_field: Reviewers will be required to individually save each Field Value they enter in the Annotator interface. This is the best solution for an Admin or Project Manager who wants reviewers to add annotations to every Field.
save_all_at_once: Reviewers can save changes to Field Values all at once, using one “Save” button that will appear at the bottom of the page. This is the more efficient solution for an Admin or Project Manager who doesn’t expect reviewers to correct, edit, or add annotations, but simply wants reviewers to update some Field Values.
no_text: Reviewers will not be able to see the text of documents; only the Fields.
4. Click “Save” at the top of the window once you have entered all this data.
Creating a Document Type is only the first step. To start using it in a meaningful way, you will need to create and add unique Document Fields to be included in the Document Type. Document Fields represent the data points you want extracted from any document labeled as that Document Type. See the “Document Fields” page to read more about creating Document Fields.
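The Code criteria listed in step 3 can be checked before you fill in the form. The following is a minimal sketch of such a check in Python; the function name `code_problems` and the wording of its messages are made up for this example and are not part of ContraxSuite, and rejecting an empty Code is an assumption.

```python
import re


def code_problems(code: str) -> list:
    """Return a list of problems with a Document Type Code; empty means it looks valid."""
    problems = []
    if not code:
        # Assumption: an empty Code is not acceptable.
        problems.append("is empty")
    if len(code) > 50:
        problems.append("longer than 50 characters")
    if any(ch.isupper() for ch in code):
        problems.append("contains uppercase letters")
    if " " in code:
        problems.append("contains spaces (use underscores instead)")
    # Uppercase letters and spaces are reported above, so they are tolerated here.
    if not re.fullmatch(r"[A-Za-z0-9_ ]*", code):
        problems.append("contains characters other than Latin letters, digits, and underscores")
    return problems


if __name__ == "__main__":
    for candidate in ("document_type_code_one", "Lease Agreement", "lease-2024"):
        print(candidate, "->", code_problems(candidate) or "OK")
```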
Managing Document Types
Once you have created a Document Type, you can click on that Document Type in the Document Type Configuration grid to make additional changes.
Categories: Creating Categories in a Document Type allows you to organize Fields within the user interface.
You can choose which Fields belong to which Category from a Document Field’s creation/edit page. (see Additional Forms on the “Document Fields” page for more)
You can also set a Field’s Category on the Document Type’s edit page, or delete a Field from the system with the grey “X” button.
Exporting and Importing Document Types
A Document Type incorporates the Document Fields, Document Field Detectors, and Categories assigned to it. Document Types - along with all of their Fields and Field Detectors - can be migrated to other instances of ContraxSuite. Migrating a Document Type into a different instance, or updating an existing Document Type in the same instance, can be accomplished using the import/export function.
1. Go to Management and select Document Types. On the Document Types Configuration Grid, select the Document Type you wish to export. Note: The “Export” button on the Grid page will only export data from the Grid.
2. On the Document Type’s edit page, click Export at the top.
3. The Document Type should have been exported to your computer’s “Downloads” folder as a .json file. Next, log in to the ContraxSuite instance you wish to import this Document Type into. Go to the Document Types page in this other instance, and click “Import”.
4. In the pop-up window, click “Choose File” and find and upload the .json file you exported. Once you’ve selected the correct .json file, you must choose one of four “Actions” from the “Action:” drop-down.
Option 1: Validate Only. Recommended first step for Document Types already in use. Choose this option if you want to first ascertain whether the data in a Document Type can validly be imported into your chosen environment. This option is useful if you first want to figure out what is different between the source ContraxSuite instance and the destination instance, but wish to address any problematic discrepancies manually.
Option 2: Validate and import if valid. Recommended approach for new Document Types. Choosing this option will allow the system to import the Document Type only if there are no conflicts. If there are conflicts, the import will be cancelled and the conflicts will be reported in the log.
Option 3: Import and force auto-fixes - Retain extra fields / field detectors. WARNING: May delete data. This should only be used for Document Types that are not in live usage and/or do not yet have live data. This will force resolution of conflicts between the Document Type in the destination instance and any changes made in the source instance, and potentially delete invalid Field configurations, though all extra Fields and Field Detectors will still be stored. If you choose this option, it’s a good idea to follow up by reviewing Fields and Field Detectors for any extra data that may have been created by the import.
Option 4: Import and force auto-fixes - Remove extra fields / field detectors. WARNING: May delete data. This should only be used for Document Types that are not in live usage and/or do not yet have live data. Choosing this option will force an older version of this Document Type in the destination instance into a configuration that conforms exactly to its configuration in the source instance, potentially deleting Fields, Field Detectors, Field Values, and/or Field Annotations in order to do so.
5. Next, select the “Source CS Version”, if the source instance that produced the .json file is a different major version of ContraxSuite than the destination instance. This option is set to “current” by default. In most circumstances, the default “current” option will work.
6. Finally, check or uncheck the box marked “Documents: Cache document fields after import finished”. Checking this box will ensure that all document fields are cached after importing.
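Before uploading an exported file in step 4, it can be worth confirming that the download is intact. The sketch below only checks that the file parses as JSON and reports its top-level shape; it deliberately assumes nothing about the export's internal structure, which is not described here. The function name and the file name in the usage example are hypothetical.

```python
import json
from pathlib import Path


def summarize_export(path: str) -> str:
    """Parse an exported file and describe its top-level shape without assuming a schema."""
    text = Path(path).read_text(encoding="utf-8")
    data = json.loads(text)  # raises json.JSONDecodeError if the file is damaged
    if isinstance(data, list):
        return f"list with {len(data)} item(s)"
    if isinstance(data, dict):
        return f"object with {len(data)} key(s)"
    return f"single {type(data).__name__} value"


if __name__ == "__main__":
    # Hypothetical file name; use the path of the file you actually exported.
    print(summarize_export("document_type_export.json"))
```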
Recommended Approach For Preserving User-Entered Data
We recommend that the following procedure be completed during maintenance hours:
1. Select “Validate Only” and uncheck “Cache document fields after import finished”. This will not change anything in your production data or configurations; instead, you will just have a report of issues to resolve. If the Document Type in question has many Fields, and you’ve been careful about only making changes on a development (“dev”) instance, then you may skip to #5 below.
2. When this task completes, the system will display either “SUCCESS” or “FAILURE”. Click View Logs to scan the log. You may have to skim the entire length of the log to find the word ERROR. If ERROR does not appear in the log, then there were no issues found and you can skip to #5 below.
3. Potential conflicts will lead to the message “VALIDATION ERRORS OCCURRED DURING VALIDATION”. The validation errors will be numbered and will look similar to the following:
"VALIDATION ERROR 1. Unable to update field #[UUID HERE] field_code_name_here.
Field type has changed, old field type is "Amount", new field type
is "Floating Point Number." Existing document field values become
invalid and will be removed. User entered X values, automatically
detected Y values. You need to set auto-fixes option to continue.
4. It is recommended you manually resolve all of the errors, rather than choosing any of the options that force changes (“Option 3” and “Option 4” above). Manually resolving errors will better inform you of what data you may be deleting, and this is also the safest method for making sure you have not deleted any data inadvertently. This manual review may involve deleting empty Document Fields, deleting extra Field Detectors that were already deleted from the source instance, etc. You can see which Field is being referred to by copying the UUID from the error code and pasting it at the end of this URL: https://www.YOURCONTRAXSUITEINSTANCENAME/advanced/admin/document/documentfield/[UUIDHERE].
5. Once you have resolved all of the errors documented in the log, run the “Import” task again, choose the .json file, and then select “Validate and import if valid” in the “Action:” drop-down. Leave “Cache document fields after import finished” checked.
6. If all errors are resolved, the “Import” task will succeed. If you have not resolved all remaining errors, the task will fail. Look through the logs, and if necessary repeat these steps, to discover any additional unresolved errors.
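Steps 2 through 4 above involve skimming a long log for validation errors and turning each UUID into an admin URL. If you copy the log text out of the interface, that part can be sketched in Python as follows. This assumes each error line follows the sample message shown in step 3 (an error number, then `#` followed by a standard 36-character UUID and the Field Code); the function name is hypothetical, and the host is a placeholder you must replace with your own instance name.

```python
import re

# Assumes the layout of the sample message: "VALIDATION ERROR <n>. ... #<UUID> <field_code>"
ERROR_RE = re.compile(
    r"VALIDATION ERROR (\d+)\..*?#\s*"
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\s+(\w+)"
)
# URL pattern quoted in step 4; {host} stands in for YOURCONTRAXSUITEINSTANCENAME.
ADMIN_URL = "https://www.{host}/advanced/admin/document/documentfield/{uuid}"


def list_validation_errors(log_text: str, host: str) -> list:
    """Return (error number, field code, admin URL) for each validation error found."""
    results = []
    for number, uuid, field_code in ERROR_RE.findall(log_text):
        url = ADMIN_URL.format(host=host, uuid=uuid)
        results.append((int(number), field_code, url))
    return results


if __name__ == "__main__":
    sample = (
        "VALIDATION ERROR 1. Unable to update field "
        "#123e4567-e89b-12d3-a456-426614174000 rent_amount.\n"
    )
    for number, code, url in list_validation_errors(sample, "example.com"):
        print(number, code, url)
```

An empty result only means no line matched this assumed layout, so it is still worth searching the log for the word ERROR as described in step 2.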
Potential Problem Scenarios in Export/Import
WARNING: There is a lot of potential for conflicts when a Document Type exists on 2 or more ContraxSuite instances and you wish to update one instance with the Document Type configurations from another. It is STRONGLY recommended that you only ever make changes to Document Types in one “development” instance, and then migrate all new configurations to your main project environment once they are ready. Failure to do so is likely to create extraneous Fields and Field Detectors that cannot be synchronized, which can lead to downstream problems like accidental deletion of data, malfunctioning Fields, or other errors.
Below are examples of potential problem scenarios that can occur as a result of improper Document Type migration.
Conflict in Field Types: Imagine you have a String Field that you later decide to convert to a Number Field. If you change the Field Type on the source instance, the data will be deleted when you import the Field to the destination instance, because the string value cannot be converted to a number. This issue could lead to downstream deletion of string data from the Field.
Duplicate Fields: Imagine you have a Field with the Field Code “rent_amount” on one instance. On another instance, you create a Field and re-use that same Field Code of “rent_amount”. There is no inherent conflict when using the same Field Code on two different instances, but if you try to migrate this Field to an instance that already has a Field with the same Field Code, the system will not be able to resolve the conflict and will attempt to create a duplicate Field or delete one of the Fields. Depending on which of the four “Action” options you choose during the import process, this could lead to the “Import” task failing. Document Fields and Document Field Detectors are migrated and tracked based on their Codes, and on their unique UUIDs. Be sure to monitor both Field Codes and Field UUIDs for duplication issues.
Extra Field Detector: The system will always try to synchronize Field Detectors during an “Import” task. Imagine that you did not choose an import option that forces auto-fixes, and you have 3 Field Detectors for one Field. You decide that only 1 Field Detector is worth keeping, so you delete the other 2 Field Detectors from the source instance. However, your destination instance still contains a version of the Field that has 3 Field Detectors. Importing this Field from the source instance to the destination instance may result in a validation error. We recommend that in situations like this, you manually review and delete any extra Field Detectors from both your source instance and your destination instance, so that you are positive of what has been deleted. Depending on which of the four “Action” options you choose during the import process, failure to perform this kind of manual review could lead to the “Import” task failing, or to accidental deletion of data.
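The three scenarios above can be pictured as a comparison between the Field configurations on the source and destination instances. The sketch below models that comparison on plain Python objects. `FieldConfig` and `find_conflicts` are hypothetical names invented for this illustration; they do not reflect the export file's schema or how ContraxSuite performs its own validation.

```python
from dataclasses import dataclass, field


@dataclass
class FieldConfig:
    """Hypothetical stand-in for one Document Field; not the export schema."""
    uuid: str
    code: str
    field_type: str
    detector_uuids: set = field(default_factory=set)


def find_conflicts(source: list, destination: list) -> list:
    """Compare two lists of FieldConfig and describe likely import conflicts."""
    conflicts = []
    dest_by_uuid = {f.uuid: f for f in destination}
    dest_by_code = {f.code: f for f in destination}
    for src in source:
        same_uuid = dest_by_uuid.get(src.uuid)
        same_code = dest_by_code.get(src.code)
        if same_uuid is None and same_code is not None:
            conflicts.append(f"{src.code}: same Field Code, different UUID (duplicate Field)")
            continue
        if same_uuid is None:
            continue  # new Field, nothing on the destination to conflict with
        if same_uuid.field_type != src.field_type:
            conflicts.append(
                f"{src.code}: Field Type changed from {same_uuid.field_type} to {src.field_type}"
            )
        extra = same_uuid.detector_uuids - src.detector_uuids
        if extra:
            conflicts.append(f"{src.code}: {len(extra)} extra Field Detector(s) on destination")
    return conflicts


if __name__ == "__main__":
    src = [FieldConfig("u1", "rent_amount", "Number", {"d1"})]
    dst = [FieldConfig("u1", "rent_amount", "String", {"d1", "d2", "d3"})]
    for line in find_conflicts(src, dst):
        print(line)
```

Keeping all edits on a single development instance, as recommended above, is what keeps a comparison like this empty.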