General Concepts

Overview


What is Opal?

Opal is OBiBa's core database application for biobanks. Participant data, once collected, for instance with OBiBa's Onyx application, must be integrated and stored in a central data repository under a uniform model. Opal is such a central repository. The current version of Opal can import, process, and copy data. Opal is typically used in a research centre to analyze the data acquired at assessment centres. Its ultimate purpose is to achieve seamless data sharing among biobanks.

For more information on Opal's future, see the Opal description on OBiBa.

Functional Units

Following the OBiBa paradigm of separation of concerns, the concept of "Functional Units" defines how to protect participants' privacy while exchanging data with Opal. Exchanges can go in both directions, imports and exports, but the current version of Opal only supports imports. A functional unit may or may not be part of your organization.

Participants' privacy is ensured by:

  • not communicating a unit's private participant identifier (a shared key is used instead),
  • encrypting the data exchanged between two units.

Data Encryption

To protect participants' privacy, the file exchanged between two units, which contains participant identifiers along with data, should be encrypted. The functional unit description in Opal includes the unit's keystore, which contains the encryption key pairs (private key and certificate). Specifying the unit from which data are imported therefore serves two purposes: it enables Participant Key Separation and it lets Opal decrypt the imported file.

The import process of Opal is as follows:

  1. the import is always done from a specified functional unit,
  2. if the imported file is encrypted, Opal will:
    • find the private key associated with the certificate that was used to encrypt the file,
    • decrypt the file in memory,
  3. Opal then imports the data following the Participant Key Separation.
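
The control flow of the steps above can be sketched as follows. This is only an illustration, not Opal's implementation: the names ExchangedFile, FunctionalUnit, certificate_alias and read_import_file are hypothetical, the keystore is modelled as a plain dictionary, and toy_cipher is a trivial XOR placeholder standing in for real public-key decryption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ExchangedFile:
    payload: bytes
    # Hypothetical field: which certificate encrypted the payload.
    # None means the file is not encrypted.
    certificate_alias: Optional[str] = None


@dataclass
class FunctionalUnit:
    name: str
    # Hypothetical keystore model: certificate alias -> private key material.
    keystore: Dict[str, bytes] = field(default_factory=dict)


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder XOR transform; NOT real cryptography."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def read_import_file(unit: FunctionalUnit, exchanged: ExchangedFile) -> bytes:
    """Return the clear content of a file imported from the given unit."""
    if exchanged.certificate_alias is None:
        return exchanged.payload
    if exchanged.certificate_alias not in unit.keystore:
        raise LookupError(
            f"unit {unit.name!r} has no key for {exchanged.certificate_alias!r}"
        )
    private_key = unit.keystore[exchanged.certificate_alias]
    # The decrypted content is only held in memory.
    return toy_cipher(exchanged.payload, private_key)
```

The point of the sketch is that the functional unit chosen for the import determines which keystore is searched, which is why the unit must always be specified.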

Participant Key Separation

Participant privacy involves several kinds of identifying information:

  • one private key in each functional unit: this key is for the unit's internal use and should never be visible to other units,
  • one shared key per functional unit pair: this key is used to exchange participant data between two units without exposing the source unit's private key,
  • variable values that are personal information.

Opal separates the participant identifiers from the participant data in two databases:

  • the Opal key database stores the participant identifiers and personal data,
  • the Opal data database stores anonymous participant data.

A participant is identified in both databases by a unique identifier, the participant's Opal private key. Opal can find a participant from a given shared key.

The import process of Opal for one participant is as follows:

  1. the import is always done from a specified functional unit,
  2. the imported participant identifier is the shared key between this functional unit and Opal,
  3. from this shared key, Opal searches for the participant in the Opal key database:
    • if found, the Opal participant is identified internally,
    • if not found, a new participant is created in Opal:
      • this participant is assigned an Opal private key,
      • the shared key is stored in the Opal key database.
  4. the participant now exists in Opal and its data are imported:
    • if a variable is identified as being about personal information, its values are stored in the Opal key database,
    • else the variable values are stored in the Opal data database.
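
As a rough illustration of the per-participant steps above, the following sketch keeps the two databases as in-memory dictionaries. The class OpalStore, its method names and the sequential "opal-N" identifiers are assumptions made for the example; how Opal actually generates private keys and stores data is not described here.

```python
import itertools
from typing import Any, Dict, Set, Tuple


class OpalStore:
    """Toy model of the Opal key database and the Opal data database."""

    def __init__(self, personal_variables: Set[str]) -> None:
        # Variables flagged as personal information (assumed to be known up front).
        self.personal_variables = personal_variables
        # Key database: shared keys and personal values.
        self.shared_keys: Dict[Tuple[str, str], str] = {}
        self.key_db: Dict[str, Dict[str, Any]] = {}
        # Data database: anonymous values only.
        self.data_db: Dict[str, Dict[str, Any]] = {}
        self._counter = itertools.count(1)

    def import_participant(
        self, unit: str, shared_key: str, values: Dict[str, Any]
    ) -> str:
        """Import one participant's values; return the Opal private key."""
        opal_key = self.shared_keys.get((unit, shared_key))
        if opal_key is None:
            # Unknown shared key: create the participant and remember the key.
            opal_key = f"opal-{next(self._counter)}"  # illustrative identifier
            self.shared_keys[(unit, shared_key)] = opal_key
        for variable, value in values.items():
            personal = variable in self.personal_variables
            target = self.key_db if personal else self.data_db
            target.setdefault(opal_key, {})[variable] = value
        return opal_key
```

Importing the same shared key twice from the same unit resolves to the same participant, while the shared key itself never appears in the data database.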

Integration with Onyx

Onyx is OBiBa's solution for collecting participant data. Data exported by Onyx can be imported directly into Opal. It is not necessary to define a functional unit per assessment centre site: one functional unit can describe all of them, and this unit's keystore will contain the encryption key pairs for all the sites.

Variables Organization

The variables are organized in an abstract way, independently of the way they are persisted.

The following diagram presents a 'traditional' view of what a table is:

The following diagram shows the relationships between the different concepts:

Variables and Categories

A variable describes a set of values. The values of a variable are all of the same type. Possible value types are:

  • integer
  • decimal
  • text
  • binary
  • locale
  • boolean
  • datetime

A variable is about an entity, i.e. all the values for a variable are from the same entity type. Possible entity types are:

  • Participant
  • Instrument
  • ...

A category describes some of the possible values of a variable. A category is associated with one and only one variable.

Datasources and Tables

A variable is in one and only one table.

A table has several variables and is in one and only one datasource.

A datasource has several tables. A datasource is not a database: it can be persisted in a database, using different schemas. It can also be persisted in a file, in XML or Excel format. It is important to understand that Opal separates the formal description of the variables from the way they are persisted. This makes Opal very versatile.
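
The containment rules (a category belongs to one variable, a variable to one table, a table to one datasource) can be pictured with a few plain data classes. This is a minimal sketch with hypothetical class names; it deliberately says nothing about persistence, mirroring the separation between the description of variables and their storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Value types listed in this document.
VALUE_TYPES = {"integer", "decimal", "text", "binary", "locale", "boolean", "datetime"}


@dataclass
class Category:
    name: str


@dataclass
class Variable:
    name: str
    value_type: str
    entity_type: str = "Participant"
    categories: List[Category] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.value_type not in VALUE_TYPES:
            raise ValueError(f"unknown value type: {self.value_type!r}")


@dataclass
class Table:
    name: str
    variables: Dict[str, Variable] = field(default_factory=dict)

    def add_variable(self, variable: Variable) -> None:
        self.variables[variable.name] = variable


@dataclass
class Datasource:
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)

    def add_table(self, table: Table) -> None:
        self.tables[table.name] = table
```

Because each object is held by exactly one parent dictionary in this model, the "one and only one" relationships are expressed by ownership rather than by back-references.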

The copy command allows you to copy variables from one table to another table in a different datasource.

Attributes

Datasources, variables and categories have attributes. These attributes provide additional meta-information. An attribute is made of:

  • a name (required),
  • a locale (optional), which specifies the language of the attribute value,
  • a value (required even if null).

Example

Example of a variable asked_age which has the following attributes:

Name           Locale  Value
label          en      What is your age ?
label          fr      Quel est votre age ?
questionnaire          IdentificationQuestionnaire
page                   P1

The variable asked_age also has some categories (with their attributes):

Name  Attributes
888   label:en=Don't know
      label:fr=Ne sait pas
999   label:en=Prefer not to answer
      label:fr=Préfère ne pas répondre
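
To make the name/locale/value structure concrete, here is a small sketch that encodes the asked_age attributes and the attributes of category 888, and looks a value up by name and locale. The Attribute tuple and the attribute_value helper are hypothetical names chosen for the example.

```python
from typing import Iterable, NamedTuple, Optional


class Attribute(NamedTuple):
    name: str               # required
    locale: Optional[str]   # optional: language of the value
    value: Optional[str]    # required, but may be null (None)


ASKED_AGE_ATTRIBUTES = [
    Attribute("label", "en", "What is your age ?"),
    Attribute("label", "fr", "Quel est votre age ?"),
    Attribute("questionnaire", None, "IdentificationQuestionnaire"),
    Attribute("page", None, "P1"),
]

CATEGORY_888_ATTRIBUTES = [
    Attribute("label", "en", "Don't know"),
    Attribute("label", "fr", "Ne sait pas"),
]


def attribute_value(
    attributes: Iterable[Attribute], name: str, locale: Optional[str] = None
) -> Optional[str]:
    """Return the value of the attribute with this name and locale.

    Raises KeyError when no such attribute exists, so that a missing
    attribute is not confused with an attribute whose value is null.
    """
    for attribute in attributes:
        if attribute.name == name and attribute.locale == locale:
            return attribute.value
    raise KeyError((name, locale))
```

For instance, attribute_value(ASKED_AGE_ATTRIBUTES, "label", "fr") returns the French label, while attributes without a locale such as page are looked up with the default locale of None.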

Entities

The entities can be of different types:

  • Participant (most common)
  • Instrument (provided by Onyx)
  • Workstation (provided by Onyx)
  • ... (any that might fit your needs)

Each entity has a unique identifier. An entity can have several value sets, but only one value set for a particular table.

Value Types

The following table gives more information about the textual representation of a value, given a value type:

Value Type Value as a String
integer The string must consist entirely of decimal digits, except that the first character may be an ASCII minus sign '-' to indicate a negative value. The resulting integer has radix 10 and the supported range is [-2^63, 2^63-1].
decimal As described by the Java Double documentation.
text As-is.
binary Base64 encoded.
locale The string representation of a locale is <language>[_<country>[_<variant>]] (for instance en, en_CA, etc.) where:
  • language: lowercase two-letter ISO-639 code.
  • country: uppercase two-letter ISO-3166 code.
  • variant: vendor specific code, see Java Locale.
boolean True if the string is equal, ignoring case, to "true".
datetime Date-times are represented in ISO 8601 format: "yyyy-MM-dd'T'HH:mm:ss.SSSZ"
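
The rules in this table can be approximated by a small parser. The sketch below is an illustration written in Python, not Opal code: parse_value is a hypothetical helper, the decimal case uses Python's float, which accepts a grammar close to but not identical to Java's Double, and the datetime case assumes the trailing Z of the pattern denotes a numeric offset such as -0500.

```python
import base64
import re
from datetime import datetime
from typing import Any

INTEGER_PATTERN = re.compile(r"-?[0-9]+")
LOCALE_PATTERN = re.compile(r"([a-z]{2})(?:_([A-Z]{2})(?:_(.+))?)?")


def parse_value(value_type: str, text: str) -> Any:
    """Convert the string form of a value into a Python object."""
    if value_type == "integer":
        if not INTEGER_PATTERN.fullmatch(text):
            raise ValueError(f"not an integer: {text!r}")
        number = int(text)
        if not -2**63 <= number <= 2**63 - 1:
            raise ValueError(f"integer out of range: {text!r}")
        return number
    if value_type == "decimal":
        return float(text)  # approximation of Java's Double parsing
    if value_type == "text":
        return text
    if value_type == "binary":
        return base64.b64decode(text, validate=True)
    if value_type == "locale":
        match = LOCALE_PATTERN.fullmatch(text)
        if match is None:
            raise ValueError(f"not a locale: {text!r}")
        return match.groups()  # (language, country, variant)
    if value_type == "boolean":
        return text.lower() == "true"
    if value_type == "datetime":
        return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")
    raise ValueError(f"unknown value type: {value_type!r}")
```

Note that any string other than "true" (in any letter case) yields False for a boolean, which follows from the rule as stated in the table.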

Value Sets and Values

A value is associated with a variable and is part of a value set. Each value set is for a particular entity and a particular table. An entity has a maximum of one value set in one table.

Fully Qualified Names

Each of these elements has a short name. A fully qualified name will identify them uniquely:

  • Datasource fully qualified name: <datasource_name>
  • Table fully qualified name: <datasource_name>.<table_name>
  • Variable fully qualified name: <datasource_name>.<table_name>:<variable_name>

The fully qualified name might be useful for disambiguation.

Examples

Following the example of the asked_age variable, its fully qualified name could be: opal-data.IdentificationQuestionnaire:asked_age
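
The naming scheme can be decomposed mechanically, as the following sketch shows. The helper parse_fqn and the QualifiedName tuple are hypothetical, and the sketch assumes that datasource names contain neither '.' nor ':' and that table names do not contain ':'.

```python
from typing import NamedTuple, Optional


class QualifiedName(NamedTuple):
    datasource: str
    table: Optional[str] = None
    variable: Optional[str] = None


def parse_fqn(name: str) -> QualifiedName:
    """Split <datasource>[.<table>[:<variable>]] into its parts."""
    path, colon, variable = name.partition(":")
    datasource, dot, table = path.partition(".")
    if not datasource:
        raise ValueError(f"missing datasource name: {name!r}")
    if dot and not table:
        raise ValueError(f"missing table name: {name!r}")
    if colon and not (table and variable):
        raise ValueError(f"a variable needs a table and a name: {name!r}")
    return QualifiedName(datasource, table or None, variable or None)


def format_fqn(qualified: QualifiedName) -> str:
    """Rebuild the fully qualified name from its parts."""
    text = qualified.datasource
    if qualified.table is not None:
        text += "." + qualified.table
    if qualified.variable is not None:
        text += ":" + qualified.variable
    return text
```

Applied to the example above, parse_fqn splits the name into the datasource opal-data, the table IdentificationQuestionnaire and the variable asked_age.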

Derived Variables

A derived variable is a variable whose values are computed using a script. This script is expressed using the Magma Javascript API.

Views

Opal deals with variables and values in tables. Views make it possible to:

  • define a subset of a table, both in terms of variables and values,
  • define Derived Variables that are to be resolved against 'real' ones.

These virtual tables are then manipulated just like standard tables (for instance they can be copied to a datasource).

Given one table:

Table1
entity  Var1      Var2      Var3
1       Value1.1  Value2.1  Value3.1
2       Value1.2  Value2.2  Value3.2
3       Value1.3  Value2.3  Value3.3

A view can be defined so that the resulting 'table' may be:

View1
entity  Var1      Var3
1       Value1.1  Value3.1
3       Value1.3  Value3.3

or:

View2
entity  DerivedVar = function(Var1, Var2)
1       function(Value1.1, Value2.1)
3       function(Value1.3, Value2.3)
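
The two views can be reproduced with a short sketch in which a table is a dictionary from entity identifier to a row of values. The make_view helper is hypothetical, and the derived variable is written as a Python callable purely for illustration; in Opal such scripts are expressed with the Magma Javascript API.

```python
from typing import Any, Callable, Dict, Optional, Sequence

Row = Dict[str, Any]
TableData = Dict[str, Row]


def make_view(
    table: TableData,
    variables: Sequence[str],
    where: Optional[Callable[[str, Row], bool]] = None,
    derived: Optional[Dict[str, Callable[[Row], Any]]] = None,
) -> TableData:
    """Build a virtual table from a subset of entities and variables."""
    view: TableData = {}
    for entity, row in table.items():
        if where is not None and not where(entity, row):
            continue  # entity filtered out of the view
        selected = {name: row[name] for name in variables}
        for name, compute in (derived or {}).items():
            # Derived variables are resolved against the 'real' row.
            selected[name] = compute(row)
        view[entity] = selected
    return view


table1 = {
    "1": {"Var1": "Value1.1", "Var2": "Value2.1", "Var3": "Value3.1"},
    "2": {"Var1": "Value1.2", "Var2": "Value2.2", "Var3": "Value3.2"},
    "3": {"Var1": "Value1.3", "Var2": "Value2.3", "Var3": "Value3.3"},
}
keep = lambda entity, row: entity in {"1", "3"}

view1 = make_view(table1, ["Var1", "Var3"], where=keep)
view2 = make_view(
    table1, [], where=keep,
    derived={"DerivedVar": lambda row: (row["Var1"], row["Var2"])},
)
```

Since the result has the same shape as the input table, a view built this way can be passed wherever a table is expected, which matches the idea that views are manipulated like standard tables.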