Labkey tutorial

Installation

Windows (English version)

Grab the software and run the exe. It should install PostgreSQL, Java, Tomcat and the Labkey binaries. Original Labkey advice here. Skip the registration steps etc. The configuration is described here.

Windows (non-English)

A manual install is required. It failed on some older PC configurations and operating systems (32-bit Windows 7).

Linux and Mac

Find instructions on a separate page.

Initial settings

At the first log-in, Labkey will ask you to create the administrator user by providing your email address and a password. After that, you are the administrator of your Labkey copy.

Under Settings -> Site -> Admin Console, in Admin Console Links/Configuration/Files, check the location of your data files. It is probably something like /usr/share/labkey/files.

Basic usage

Administrative tasks

Create a project

Create a project called Tutorial by pressing the "Create a project" button.

Set the project mode to Study and restrict use to "My user only".

Once the project is created, we will move the web parts around. Under settings (the gear wheel icon (⚙) next to your user name) select "Page Admin Mode". You should now see little arrows next to the frames on your web page. Each frame is called a Web Part and can be either included in or excluded from the page. For now, remove the "Study overview" web part and add a "Study List" web part. Once added, it should confirm that there are

No Studies found in project tutorial or child folders

Create the Study in a sub-folder

Again we go to Management (⚙ → Folder → Management). There we locate the "Create subfolder" action button above the folder layout and create an AD subfolder. Keep "Use name as title" ticked and choose Study as the Folder Type. Allow the program to Inherit permissions from the parent folder and Finish the process.

Now we keep the Study Overview Web Part and go on to Create Study.

Keep the default Study Label, but change Subject Noun (Singular) to Patient, (Plural) to Patients and Subject Column Name to PatientId.

Under Visit/Timepoint Tracking select Assigned Visits and change Security Mode to Basic with Editable Datasets. The Specimen Management should be kept at default Standard Specimen Repository setting. Press Create Study to complete the task.

Adding data

Populating a study

Use the demo files provided:

  • Demographics
  • MMSE

Load the Demographics first. Under Manage select Manage Datasets, Create New Dataset. Under Short Dataset Name use Demographics, allow Labkey to Define Dataset Id Automatically and check Import From File. Click Next.

In Browse, select the randomDemo.xlsx file you just downloaded. Keep the columns up to Birth and uncheck the rest. Under the ParticipantId mapping select PatientID, and under SequenceNum select VisitID. Once the import is complete, click Manage, Edit Definition and check the Demographics box.

Next, create the MMSE dataset by going into Manage, Manage Datasets, Create New Dataset with Short Dataset Name MMSE and Define Dataset Id Automatically checked, and again do Import from File. Keep the first three columns (up to MMSE), link ParticipantId to PatientID and SequenceNum to VisitID, and Import.

Under Grid Views, select Customize Grid, deselect the Visit box and select SequenceNum. Both are hidden under Participant Visit.

Finally, add the ADPD dataset. Again, create a dataset with the Short Dataset Name ADPD, with Import from File checked. Keep only the PatientID, VisitID and ADPD columns, link ParticipantId to PatientID and SequenceNum to VisitID if needed, and click Next. After the data is inserted, change the Grid View to hide VisitID and show SequenceNum. Click on the arrow next to the SequenceNum column name and filter for visits greater than 1. Then check the box at the far left of the column name row to select the filtered rows, and click on the trash can. This deletes all (empty) entries for the second visit.

The data is now ready.

Basic data analysis

Summary statistics

Labkey can calculate simple statistics on dataset variables. As an example, go to the ADPD dataset (Home→tutorial→AD, Overview tab, click on datasets, select ADPD) and click on the ADPD column. The following options are available:

  • sort the rows by ADPD value
  • filter rows by operations on ADPD values
  • do Summary statistics

Clicking on the statistics menu item displays typical statistical values, which can be selected to be persistently displayed in the last row of the dataset.

Providing simple views

Navigate to the ADPD dataset. Click on the Grid view (a table-like icon just below the dataset name) and select Customize Grid. In the left column, go all the way down until you see the (shaded) DataSets field. Click on the + check box to see the available datasets. Below it, the Demographics and MMSE datasets appear. Click on the Town field name. The Town variable appears in the right list of displayed columns. Below the variable list a Save button is available; click it and save the view as Analysis. Now, Town appears next to the ADPD column.

Click on Chart (next to Grid) and select Create Chart. For the x axis select Town by dragging it over, for the y axis select ADPD, for the aggregate method select Mean, and click on Apply. The report can be saved by clicking on Save (top right corner) and giving it a name (ADPD-Town relationship).
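The Mean aggregate used in this chart can be illustrated outside Labkey. The following is a minimal Python sketch, not Labkey's implementation; the function name mean_by_group and the sample rows are made up for illustration.

```python
from collections import defaultdict


def mean_by_group(rows, group_key, value_key):
    """Return {group: mean of value_key} for a list of row dictionaries.

    Rows whose value is missing (None) are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        value = row.get(value_key)
        if value is None:
            continue
        sums[row[group_key]] += value
        counts[row[group_key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Made-up rows, not the tutorial data.
rows = [
    {"Town": "A", "ADPD": 1},
    {"Town": "A", "ADPD": 0},
    {"Town": "B", "ADPD": 1},
]
print(mean_by_group(rows, "Town", "ADPD"))
```

Each bar of the chart corresponds to one entry of the returned dictionary.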

Go back to Overview and then back to Clinical and Assay Data. ADPD-Town relationship appears as a separate Data View.

Task: Plot ADPD vs. MMSE score at the first visit.

A little bit beyond basic usage: R

Additional tools must be used for more complex analysis.

Add R as a scripting engine

Labkey relies on R to perform statistical calculations. Follow instructions here to set R as a script engine. To add packages on Linux, run R as the tomcat8 user and call install.packages(X). For statistical calculations, Hmisc is a very useful, if slightly large, addition. To install it, run

install.packages("Hmisc", repos="http://cran.r-project.org")

as the tomcat8 user from a shell session on the target PC. Confirm installation to user space.

Perform statistical analysis with R

With R, additional analysis scripts can be used. For example, we can calculate the correlation coefficient of MMSE and ADPD. For this we must first create a Grid View with both columns available. We start with the ADPD dataset and import the MMSE column from the MMSE dataset. Then we select Charts -> Create R Report. An editor opens where R commands can be entered. To calculate correlations, the lines below should suffice:

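#library('Hmisc')                        # uncomment if Hmisc is installed
colnames(labkey.data)                    # list the columns available to the script
x<-labkey.data[['datasets_mmse_mmse']]   # MMSE column joined from the MMSE dataset
y<-labkey.data[['adpd']]                 # ADPD column
X<-matrix(c(x,y),nrow=length(x))         # two-column matrix: MMSE, ADPD
cor(X)                                   # correlation matrix
#rcorr(X)                                # Hmisc: correlations with p-values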

If the Hmisc package is already installed, we can load it at the beginning (# starts a comment in R, so erasing it makes R execute the statement) and use the rcorr function on the matrix, which also provides the p-value of each correlation.

Once the script is completed, pressing on the Report tab will give you the R output. The expected correlation of the dataset is 0.3.
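To make explicit what the off-diagonal element of cor(X) represents, here is a minimal Python sketch of the Pearson correlation coefficient. It is a stand-in for illustration, not the R code run by Labkey; the function name pearson_r and the sample values are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values, not the tutorial data.
mmse = [28, 25, 30, 22, 27]
adpd = [0, 1, 0, 1, 0]
print(round(pearson_r(mmse, adpd), 3))
```

Unlike rcorr, this sketch returns only the coefficient and no p-value.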

Beyond basic usage: modules

Modules are infrastructural elements that extend Labkey's functionality. They are written in a variety of programming languages (SQL, JavaScript) and require manipulation of the server. Two examples are shown.

Adding variables through ETL scripts

ETL (Extract-transform-load) scripts are a Labkey infrastructure that provides a simple interface for repetitive tasks on imported data. Here we will use an ETL script to insert a new dataset with a new variable, the change in MMSE between visits.

First create a new dataset via Manage -> Manage Datasets -> Create New Dataset. Set the Short Dataset Name to MMSEdiff and make sure Import From File is unchecked. Proceed to the Dataset Definition Editor. All system variables are already set; we only need to add a variable with the name MMSEchange, label MMSEchange, of type Number(Double). Save the dataset.

Next we have to add an ETL module to the Labkey installation. Modules are programmable extensions to Labkey functionality. They have a predefined layout and should reside in a specified location. This location can be found by opening the Site -> Admin Console -> Server Information web page and looking under Webapp Dir. The module location is the content of the Webapp Dir variable with labkeywebapp replaced by externalModules. If this directory does not exist, create it (as tomcat8 on Linux). Extract the etlCalculate module at that location and restart Labkey.
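The path substitution described above can be sketched in a few lines of Python. This is only an illustration; the function name external_modules_dir and the example path /usr/local/labkey/labkeywebapp are assumptions, so use the Webapp Dir value reported by your own server.

```python
from pathlib import PurePosixPath


def external_modules_dir(webapp_dir):
    """Replace the trailing 'labkeywebapp' component with 'externalModules'."""
    path = PurePosixPath(webapp_dir)
    if path.name != "labkeywebapp":
        raise ValueError("expected a path ending in 'labkeywebapp': %s" % webapp_dir)
    return str(path.with_name("externalModules"))


# Hypothetical Webapp Dir value; read the real one from Server Information.
print(external_modules_dir("/usr/local/labkey/labkeywebapp"))
```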

The relevant script is in etlCalculate/queries/study/calcDiff.sql and is written in SQL. More on SQL can be found in a separate note.

Once Labkey is back on-line, go back to the Study folder, go to Settings -> Folder -> Folder Management and, under the Folder Type tab, make sure that the etlCalculate module is checked.

Back in the Overview tab of the AD study folder, go to Settings -> Page Admin Mode and add the Data Transforms web part. A line with a Run button should appear. When you press it, the deployed ETL script from the etlCalculate module is run, which populates the MMSEdiff dataset with the differences in MMSE between the first and the second visit.
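The calculation performed by the ETL step can be pictured with a short Python sketch. It is not the content of calcDiff.sql, which is not reproduced here; the function name mmse_change, the sign convention (second visit minus first visit) and the sample rows are assumptions made for illustration.

```python
def mmse_change(rows):
    """Return {PatientId: MMSE at visit 2 minus MMSE at visit 1}.

    Patients lacking either visit are left out of the result.
    """
    by_patient = {}
    for row in rows:
        visits = by_patient.setdefault(row["PatientId"], {})
        visits[row["SequenceNum"]] = row["MMSE"]
    changes = {}
    for patient, visits in by_patient.items():
        if 1 in visits and 2 in visits:
            # Assumed sign convention: later visit minus earlier visit.
            changes[patient] = visits[2] - visits[1]
    return changes


# Made-up rows, not the tutorial data.
rows = [
    {"PatientId": "P1", "SequenceNum": 1, "MMSE": 28},
    {"PatientId": "P1", "SequenceNum": 2, "MMSE": 25},
    {"PatientId": "P2", "SequenceNum": 1, "MMSE": 30},
]
print(mmse_change(rows))
```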

Try to determine the coefficient of correlation between MMSEchange and ADPD!

Use trigger scripts for common data related tasks

Trigger scripts are a separate infrastructure in Labkey. They respond to manual changes of data in a dataset: insertion, update or deletion. They are written in yet another programming language, JavaScript, and packed in a Labkey module.

We will test this functionality with a separate dataset where pointers to images are stored. Call this new dataset PETscan and create it in the same AD folder via the standard steps (i.e. Manage -> Manage Datasets -> Create New Dataset). Uncheck Import From File and click Next. In the Dataset Definition Editor add a single variable called ImagingID (Label can be left empty, the name will be used) of type Text(String). Save the new dataset.

The task is that for every entry in the dataset we create directories

[BASE]/PatientID/ImagingID/PET
[BASE]/PatientID/ImagingID/CT

where [BASE] is the core directory of our study, which we get by:

[BASE]=[LABKEY_BASE]/[project]/@files
#with
[LABKEY_BASE]=/usr/share/labkey/files
[project]=tutorial

The [LABKEY_BASE] variable can be read from Settings -> Site -> Admin Console -> Configuration/Files. This is the variable we will need later on.
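The directory layout above can be expressed as a small Python sketch. It only composes the paths and is not the trigger script shipped in the module (which is written in JavaScript); the function name scan_directories and the patient and imaging identifiers are made up.

```python
from pathlib import PurePosixPath


def scan_directories(labkey_base, project, patient_id, imaging_id):
    """Return the PET and CT directory paths for one dataset entry."""
    # [BASE] = [LABKEY_BASE]/[project]/@files
    base = PurePosixPath(labkey_base) / project / "@files"
    scan_root = base / str(patient_id) / str(imaging_id)
    return [str(scan_root / modality) for modality in ("PET", "CT")]


# Made-up identifiers; labkey_base and project follow the values in the text.
for path in scan_directories("/usr/share/labkey/files", "tutorial", "P001", "IMG01"):
    print(path)
```

To create the directories on disk, each returned path could be passed to os.makedirs(path, exist_ok=True) by a user with write access to the file root.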

The created directories will then be visible in Labkey's file browser. To add the file browser to our project, go to Overview and add the Files Web Part (entering Page Admin Mode via Settings if needed).

The module should be expanded in the externalModules directory as before. Since the Labkey paths may differ slightly between installations, you should edit the JavaScript file

externalModules/neuroModule/queries/study/PETscan.js

and change labkeyBase to the appropriate value. Restart the server. Go to the PETscan dataset and add a single entry via the (+) icon. The output can be observed on the JavaScript console, accessible via Admin -> Developer Tools -> Javascript Console. The new directories should appear in the Files Web Part.

Adding links to datasets

We will also include the link to the created directories as entries in the grid. To do that, navigate to the PETscan dataset (Overview -> datasets -> PETscan) and click on Manage, then on Edit Definition. Add a variable called PET of type Text(String), and on the right menu add the following text to the URL field:

http://localhost:8080${contextPath}/_webdav${containerPath}/%40files/${patientID}/${ImagingID}/PET

In the same menu, click on the Advanced tab and set the Default Value via the SET VALUE button. For PET enter [PET]; keep the ImagingID default value empty.

Click Save and go to Data View. An additional (empty) column appears, since the default only applies to new entries. Now create a new entry, or edit an existing one, and set the PET field value to [PET]. After saving, the [PET] label should be clickable. Clicking it makes Labkey navigate to the directory potentially holding the associated PET scan.
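How the ${...} placeholders in the URL field expand can be mimicked with Python's string.Template, which happens to use the same ${name} syntax. This is an illustration only, not Labkey's substitution code; the function name expand_link and all the values passed in are assumptions.

```python
from string import Template

# The URL pattern from the text, unchanged.
LINK_PATTERN = Template(
    "http://localhost:8080${contextPath}/_webdav${containerPath}"
    "/%40files/${patientID}/${ImagingID}/PET"
)


def expand_link(context_path, container_path, patient_id, imaging_id):
    """Fill the placeholders of the link pattern with concrete values."""
    return LINK_PATTERN.substitute(
        contextPath=context_path,
        containerPath=container_path,
        patientID=patient_id,
        ImagingID=imaging_id,
    )


# Assumed values: context path /labkey, folder /tutorial/AD, made-up ids.
print(expand_link("/labkey", "/tutorial/AD", "P001", "IMG01"))
```

Note that %40 is the URL-encoded form of @, so the link points into the @files directory of the folder.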

Useful R packages

Cairo

install.packages('Cairo',dependencies=TRUE)

KNITR

#requires 
#  mesa-common-dev libglu1-mesa-dev libssl-dev 
#  libcurl4-openssl-dev cargo (python3.5) 
#  libmagick++-dev
install.packages('knitr',dependencies=TRUE)

Because we are on a headless server (one not running X11), we get a warning:

Warning in rgl.init(initValue, onlyNULL) :
  RGL: unable to open X11 display
Warning: 'rgl_init' failed, running with rgl.useNULL = TRUE

It should be OK since X11 is not explicitly required.

kableExtra

For hot tables:

install.packages("kableExtra")

Remote database access

The strategies presented above rely on server-side Labkey routines and additional programs. However, the data can be manipulated from client programs as well. Interfaces for common programs are available from Labkey, and are in fact wrappers for HTTP interface calls. The HTTP API is presented below.

HTTP Interface

To get data to or from a client application, the HTTP protocol is (ab)used for programmatic calls. Labkey calls this the HTTP API. The idea is that correctly phrased HTTP URLs interact directly with the Labkey core, which responds with data in JSON format.

A typical situation is a direct query that returns a subset of a dataset. This is done with a selectRows query with the syntax:

http://<Server>/labkey/query/<MyProj>/selectRows.api?schemaName=study&query.queryName=[query]&query.[varName]~[oper]=[value]
#where e.g.:
<Server>=localhost:8080 # a server instance (the http:// prefix is already part of the template)
<MyProj>=tutorial/AD #the folder path
[query]=MMSE # a particular dataset
[varName]=SequenceNum # a variable in the dataset
[oper]=gt # operator, such as eq-ual, dateeq-ual, etc.
[value]=1.5 # value for the operator
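Assembling such a URL by hand is error-prone, so here is a minimal Python sketch that builds it from its parts using only the standard library. The function name select_rows_url and its parameters are illustrative and not part of any Labkey client library; the URL layout simply follows the syntax given above.

```python
from urllib.parse import quote, urlencode


def select_rows_url(server, folder_path, query_name, filters=(), schema_name="study"):
    """Build a selectRows URL.

    filters is a sequence of (varName, oper, value) tuples,
    e.g. ("SequenceNum", "gt", 1.5).
    """
    params = [("schemaName", schema_name), ("query.queryName", query_name)]
    for var_name, oper, value in filters:
        params.append(("query.%s~%s" % (var_name, oper), str(value)))
    return "http://%s/labkey/query/%s/selectRows.api?%s" % (
        server,
        quote(folder_path),  # keeps "/" between folder names
        urlencode(params),
    )


print(select_rows_url("localhost:8080", "tutorial/AD", "MMSE",
                      filters=[("SequenceNum", "gt", 1.5)]))
```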

The HTTP requests can be sent via a browser or another URL-aware program. For C++, curl, qt5 and similar libraries provide a comprehensive set of URL commands. The return value is a text string, which the browser displays as text and which can be parsed by JSON-aware programs.

For example, test the response to the selectRows command with:

http://localhost:8080/labkey/query/tutorial/AD/selectRows.api?schemaName=study&query.queryName=MMSE
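A client program can issue the same request and decode the JSON text. The Python sketch below assumes that the response is a JSON object with a top-level "rows" list and that the server accepts the request without extra authentication; both are assumptions to verify against your own server. The function names parse_rows and fetch_rows are made up.

```python
import json
from urllib.request import urlopen


def parse_rows(json_text):
    """Return the list under the (assumed) top-level "rows" key."""
    payload = json.loads(json_text)
    return payload.get("rows", [])


def fetch_rows(url, timeout=10):
    """Send a GET request to url and return the decoded rows."""
    with urlopen(url, timeout=timeout) as response:
        return parse_rows(response.read().decode("utf-8"))


# Made-up response text, used here instead of a live server.
sample = '{"rowCount": 2, "rows": [{"MMSE": 27}, {"MMSE": 30}]}'
print(parse_rows(sample))
```

With a running server, fetch_rows can be called with the URL shown above; a server that requires a login will need credentials added to the request.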

Certificate and key extraction

openssl pkcs12 -in astuden.p12 -out astuden.crt -nokeys -clcerts
openssl pkcs12 -in astuden.p12 -out astuden.key -nocerts
