(1/5) Legal text: from document to "data" approach

Organizations subject to increasingly complex regulations can no longer maintain a documentary approach to legal standards. They need to switch to a data-centric approach.

This article is the first in a series of five that we will be publishing in the coming weeks.


A legal document is a text with a special status: it is normative. It obliges people to perform, or refrain from, specific actions according to established rules. This "performative" [1] aspect is what primarily shapes how a legal text is used and structured.

Elements beyond the text itself carry this performative dimension: the authors, the people targeted, the circumstances, and the drafting and revision process (who reread it, who validated it, who was informed).


A legal document is thus made up of:

  • A text (content of the document, annexes, images / diagrams / tables)

  • A context or paratext [2] (language of the document, date of creation, date of publication)

  • References or hypertexts (external references, internal references)

  • Indexing or metatext elements (tags / themes, targeted people)

  • A drafting or architext process (writer, proofreaders, validators, consent, evidence)
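The five levels listed above can be sketched as a small data structure. This is only an illustrative model, not Legalcluster's actual schema: every class name, field name, and sample value below is a hypothetical choice made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Paratext:
    """Context of the document (hypothetical field names)."""
    language: str
    created_on: date
    published_on: Optional[date] = None


@dataclass
class Architext:
    """Drafting process: who wrote, proofread, and validated the text."""
    writer: str
    proofreaders: list[str] = field(default_factory=list)
    validators: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)


@dataclass
class LegalDocument:
    """One legal document described on all five levels."""
    text: str                                                   # text: content, annexes
    paratext: Paratext                                          # context
    references: list[str] = field(default_factory=list)         # hypertext
    tags: list[str] = field(default_factory=list)               # metatext: themes
    targeted_people: list[str] = field(default_factory=list)    # metatext: audience
    architext: Optional[Architext] = None                       # drafting process


# Made-up sample document, for illustration only.
policy = LegalDocument(
    text="Employees must declare gifts received from suppliers.",
    paratext=Paratext(language="en", created_on=date(2021, 1, 4)),
    references=["code-of-conduct"],
    tags=["anti-corruption"],
    targeted_people=["all employees"],
    architext=Architext(writer="legal team", validators=["compliance officer"]),
)

print(policy.architext.validators)
```

Keeping each level in its own field means the context, links, indexing, and drafting history can be read and updated without parsing the text itself.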

These different levels are present in any legal document, explicitly or implicitly, and can be managed on paper or as scanned text when the set of legal documents is small and homogeneous.

However, once this set grows into a true body of standards (documents that are constantly updated, governed by multiple regulations, subject to heterogeneous validation processes, and applicable to different target populations), the mass of data becomes too large to manage in textual format alone.

To account for these elements of the normative corpus, it is necessary to go beyond the traditional documentary approach. A data approach allows organizations to enrich their norms at every stage of the development process.
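As a rough illustration of what treating the corpus as data makes possible, the sketch below filters a tiny, made-up corpus by regulation and validation status. The `Norm` record, its fields, and the sample entries are assumptions introduced for this example only; they do not describe any real corpus or product.

```python
from typing import NamedTuple


class Norm(NamedTuple):
    """One document of the normative corpus (hypothetical fields)."""
    title: str
    regulation: str   # regulation the document answers to
    target: str       # population the document applies to
    version: int      # incremented at each update
    validated: bool   # whether the validation process is complete


# Made-up sample corpus, for illustration only.
corpus = [
    Norm("Gift policy", "anti-corruption", "all employees", 3, True),
    Norm("Data retention rules", "data protection", "IT staff", 1, False),
    Norm("Supplier due diligence", "anti-corruption", "purchasing", 2, False),
]


def pending_validation(norms, regulation):
    """Return the titles of norms under `regulation` not yet validated."""
    return [n.title for n in norms
            if n.regulation == regulation and not n.validated]


print(pending_validation(corpus, "anti-corruption"))
# prints ['Supplier due diligence']
```

A question like this one requires reading every document when the corpus exists only as text. With structured records it becomes a one-line filter.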

After moving from text to hypertext (links between documents) and then to metatext (indexing using tags), the legal corpus is entering a new stage in the evolution of the document: the architext [3].

We will explore these concepts in future articles.


To schedule a demo, please contact us here.


Pauline Armary is a data analyst at Legalcluster.

After studying literature and philosophy, Pauline has applied her expertise in similarity-based clustering to accelerating legal operations and digitalizing compliance.


[1] J. L. Austin, Quand dire, c’est faire, 1962

[2] G. Genette, Palimpsestes, 1982

[3] This term comes from the terminology defined by Gérard Genette in Palimpsestes. The concept of architext is the central element of Legalcluster's architecture.
