AMED Data Utilization Platform Operating Manual(General)

Index

1 Change table
2 Common Information
3 Login
4 Common Features
5 Dashboard(TOP)
6 Metadata
7 Project Information
8 Pre-Research
9 News & Help
10 Application for Data Utilization
11 Supplement
12 FAQ

1 Change table

The following changes were made on 2024/7/23.
This manual still describes the old system, so please read it with these changes in mind.
・Change the order in which menus are displayed


・Change the display order of Data Management


・Menu name change
Table of old and new menu names
Old label (Japanese) | Old label (English) | New label (Japanese) | New label (English)
メタデータ | Metadata | メタデータ検索 | Search Metadata
処理環境アップロード | Analysis Environment Upload | Dockerイメージアップロード | Upload Docker Image
アップロード済み処理環境一覧 | List of uploaded analysis environment | Dockerイメージ一覧 | Docker Images
データファイルアップロード | Data File Upload | ユーザファイルアップロード | Upload User File
データファイル一覧 | Data File List | ユーザファイル一覧 | User Files
予備的処理結果ファイルコピー | Pre-Research Results File Copy | アレル頻度データをユーザファイル(VCF)としてコピー | Copy Allele Frequency Data as User File (VCF)
ファイルダウンロード | Pre-Research Preset File Download | ダウンロード | Downloads
サンプルセット登録 | Sample Set Registration | アレル頻度データ作成 | Generate Allele Frequency Data
サンプルセット一覧 | Sample Set List | アレル頻度データ一覧 | Allele Frequency Data
処理環境登録 | Analysis Environment Registration | Docker環境作成 | Setup Docker Environment
処理環境一覧 | Analysis Environment List | Docker環境一覧 | Docker Environments
連携システムの利用 | Use of Computer Systems on Collaborative Center (* same at the top of the page) | 連携システムの利用 | Login to Computer System on Collaborative Center (* same at the top of the page)

・Changed the name of "データ利活用プラットフォーム基本サンプルセット" in the sample set list to "CANNDs_23K"
・Moved file downloads from the bottom of the Data Management screen to a separate screen

2 Common Information

2.1 Access URL

This manual describes the operations of the Health and medical R&D data integration and utilization Platform (hereafter, the AMED Data Sharing Platform). The AMED Data Sharing Platform can be accessed from the following URL.
https://prod-www.cannds.amed.go.jp/web/login/init

*To use the Computer Systems on Collaborative Centers and Genotype Imputation features described in this manual, you need to access the platform from a URL different from the one listed above. Please contact us at the following email address for details on the URL for accessing these features.
Email Address: platform"AT"amed.go.jp (Replace "AT" with "@".)

2.2 Screen Configuration

Each function consists of the following screens. For operating procedures for each screen, see the relevant page.

* Use of Computer Systems on Collaborative Centers and Genotype Imputation do not become available simply by creating a login account.
(These features become available after the data use screening.)
After the screening, log in and check the Help & Contact screen for information about how to use these features.

2.3 Session Timeout

Session information is discarded after 30 minutes of inactivity following login.
If the session times out, log in again to use the platform.
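
The timeout rule can be pictured with the following minimal Python sketch. It is only an illustration, not the platform's implementation. The function name is hypothetical, and treating the session as expired at exactly 30 minutes is an assumption.

```python
from datetime import datetime, timedelta

# 30 minutes of inactivity, as stated in this section.
SESSION_TIMEOUT = timedelta(minutes=30)


def is_session_expired(last_activity: datetime, now: datetime) -> bool:
    """Return True when the idle time has reached the timeout.

    Assumption: the boundary (exactly 30 minutes) counts as expired.
    """
    return now - last_activity >= SESSION_TIMEOUT


last_activity = datetime(2024, 7, 23, 9, 0)
print(is_session_expired(last_activity, datetime(2024, 7, 23, 9, 29)))  # False
print(is_session_expired(last_activity, datetime(2024, 7, 23, 9, 31)))  # True
```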

2.4 Connecting with the Supercomputer System

The AMED Data Sharing Platform controls whether you can connect to the supercomputer system based on your operating environment.
As shown in the diagram below, connections to the supercomputer system are available only through the collaborative computer systems.*1
When connecting over the Internet, the [Use of Computer Systems on Collaborative Centers] and [Genotype Imputation] buttons or links will be hidden from view and inaccessible. (Attempting to directly access the URL will result in a system error.)
*1 Please ask your system administrator for the URL used when connecting through the collaborative computer systems.

3 Login

3.1 Login

The AMED Data Sharing Platform uses two login authentication methods.
One is to log in through the authentication system of the research institution the user belongs to, via an authentication platform called GakuNin (https://www.gakunin.jp/). The other is to log in with SecurID, using the login ID issued by the AMED Data Utilization Platform.
The authentication system used is set when creating your account. Log into your account using the method specified by AMED. (The authentication method is fixed for each user. You cannot log in using a method other than that specified.)
When logging in using SecurID, the system will behave differently depending on whether this is your first time logging in.

■ Logging in with GakuNin
-> See 3.2 Logging in with GakuNin.
■ Logging in for the first time with SecurID
-> See 3.3 Logging in for the first time with SecurID.
■ Subsequent logins using SecurID
-> See 3.4 Logging in with SecurID (subsequent logins).

3.2 Logging in with GakuNin

3.2.1 Select an [Affiliate Institution]

Select the institution you belong to.
When you select an institution, the login screen for that institution will appear.

3.2.2 Authentication by affiliate institution

On the login screen of the selected institution, perform the login authentication process.
If the institution does not require two-factor authentication on the AMED Data Utilization Platform, login is successful when this authentication process succeeds, and the Dashboard (TOP) screen appears.
For institutions that do require two-factor authentication, you will be taken to the Enter One-Time Password (OTP) screen when this authentication process succeeds.

(The Orthros authentication screen is shown here as an example. The actual screen layout and entry fields will vary depending on the institution selected.)

3.2.3 Enter One-time Password

After you successfully pass the authentication procedure of an institution that requires two-factor authentication, an email containing the one-time password will be sent to the registered email address. If authentication is successful after entering the one-time password, the Dashboard (TOP) screen will appear.

3.3 Logging in for the first time with SecurID

When logging in for the first time, you will be asked to register a passcode number.
Enter your login ID and the initial passcode number provided to open the Register Passcode Number screen.
Register your own passcode number. * For subsequent logins, enter the passcode number you registered.
If authentication fails on any screen, you will be returned to the Login screen (Enter ID).

3.3.1 First-time Logins (Enter ID)

Enter your login ID and click the [LOGIN] button.
If authentication is successful, the Enter Passcode Number screen will appear.

3.3.2 First-time Logins (Enter Passcode Number)

After you enter the login ID, the initial passcode number authentication process will begin.
If authentication is successful, the Register New Passcode Number screen will appear.


Once the new passcode number is registered, enter the passcode number again.
*Enter the passcode number registered, not the initial passcode number.
If authentication is successful, the Enter One-time Password screen will appear.

3.3.3 First-time Logins (Register a New Passcode Number)

Enter the passcode number to register, and then click the [REGISTER PERSONAL IDENTIFICATION NUMBER (PIN)] button.
If registration is successful, the Enter Passcode Number screen will appear.
Passcode numbers must satisfy the following requirements to be registered.
・Be 4 to 8 digits long, using half-width numerals only.
・Not match the three most recent passcode numbers used.
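
As an illustration only, the two requirements above can be expressed as a small Python check. This is a sketch under assumptions, not the platform's actual validation: the function name is hypothetical, and the list of earlier passcode numbers is assumed to be ordered oldest first.

```python
import re

# [0-9] matches half-width (ASCII) numerals only; full-width digits are rejected.
PASSCODE_PATTERN = re.compile(r"[0-9]{4,8}")


def is_valid_passcode(candidate: str, previous_passcodes: list[str]) -> bool:
    """Check the two registration rules listed above.

    previous_passcodes is assumed to be ordered oldest first, so the last
    three entries are the three most recent passcode numbers.
    """
    if PASSCODE_PATTERN.fullmatch(candidate) is None:
        return False
    return candidate not in previous_passcodes[-3:]


print(is_valid_passcode("482915", ["1111", "2222", "3333"]))  # True
print(is_valid_passcode("3333", ["1111", "2222", "3333"]))    # False
```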

3.3.4 First-time Logins (Enter One-time Password)

An email containing the one-time password will be sent to the registered email address.
If authentication is successful, the Dashboard (TOP) screen will appear.
*If you wish to change your email address, please contact your system administrator.

3.4 Logging in with SecurID (subsequent logins)

For subsequent logins, enter the passcode number you registered when logging in for the first time.
Complete the authentication process using the one-time password sent to the email address registered with the system.
*If you forget your passcode number, please contact your system administrator to reset your passcode number.

3.4.1 Login (Enter ID)

Enter your login ID and click the [LOGIN] button.
If authentication is successful, the Enter Passcode Number screen will appear.

3.4.2 Login (Enter Passcode Number)

Enter the passcode number registered when first logging in, and then click the [SEND PERSONAL IDENTIFICATION NUMBER (PIN)] button.
If authentication is successful, the Enter One-time Password screen will appear.

3.4.3 Login (Enter One-time Password)

Complete the authentication process using the one-time password sent to the email address registered with the system.
Enter the one-time password, and then click the [SEND ONE-TIME PASSWORD(OTP)] button to log in.
If authentication is successful, the Dashboard (TOP) screen will appear.

4 Common Features

4.1 Header and Menu

Each screen features a common header and menu.

4.1.1 Header

You can perform the following actions from the header.
・Display the main page of the Japan Agency for Medical Research and Development website in a separate tab.
・Open and close the menu bar.
・Connect to the integrated supercomputer system (only available with a URL connection through the collaborative computer systems).
・Switch the language display between Japanese/English.
・Display a link to the User Details screen.
・Display the [Logout] button.




4.1.2 Connecting to the Supercomputer

Use this to connect to the supercomputer system.
This option is displayed or hidden based on your connection environment.
・This is displayed when connecting to a URL through the collaborative computer systems.
・This is hidden when connecting to an Internet connection URL.

4.1.3 JA/EN switch

You can switch the screen display between Japanese and English.
* Note that the language will only be changed for labels and other fixed text. Data registered in Japanese will still appear in Japanese, even if the screen display language is changed to English.

4.1.4 Logout

Click the [Logout] button to log out. Log out after your work is complete.

4.1.5 Menu

4.2 Breadcrumbs List

A [breadcrumbs list] showing page navigation is located at the top of each screen to show which screen is currently displayed. Click a link in the [breadcrumbs list] to jump to the corresponding screen.


List of link destinations
Screen name | Link name | Link destination
Common to all screens | TOP | Dashboard (TOP) screen
Search Metadata screen | Metadata | Search Metadata screen
Metadata Search Results screen | Metadata | Search Metadata screen
Metadata Search Results screen | Search | Search Metadata screen
Metadata Details screen (shared data for each) | Metadata | Search Metadata screen
Metadata Details screen (shared data for each) | Search | Search Metadata screen
Project List screen | Project Information | Project List screen
Project Details screen | Project Information | Project List screen
Project Details screen | Project List | Project List screen
Data Management screen | Pre-Research | Data Management screen
Sample Set Registration screen | Pre-Research | Data Management screen
Sample Set List screen | Pre-Research | Data Management screen
Variant List screen | Pre-Research | Data Management screen
Variant List screen | Sample Set List | Sample Set List screen
Variant Details screen | Pre-Research | Data Management screen
Variant Details screen | Sample Set List | Sample Set List screen
Variant Details screen | Variant List | Variant List screen
Analysis Environment Registration screen | Pre-Research | Data Management screen
Analysis Environment List screen | Pre-Research | Data Management screen
About Workflow screen | Genotype Imputation | About Workflow screen
Job Configuration screen | Genotype Imputation | About Workflow screen
Job List screen | Genotype Imputation | About Workflow screen
News screen | News & Help | News screen
Help & Contact screen | News & Help | News screen

4.3 Sort and Pager

The metadata search results list and project list use a sort function and pager function.

4.4 User Details

Click the [User Details] button in the header to jump to the User Details screen. You can view details on the login user and upload a profile image on the User Details screen. Uploaded image files will appear in the header. The following restrictions apply when uploading files.
Item name | Restriction
File size | 2 MB or less
File extensions | Only .jpeg, .jpg, and .png file extensions accepted
Vertical size of file | 560 px
Horizontal size of file | 420 px





The registered profile image will appear in the header.

5 Dashboard(TOP)

This screen appears immediately after you log in, and when you click the [TOP] link on the menu.
[1] Area for emergency news
[2] Area for news of planned shutdowns and other news
[3] Area for displaying metadata summary information
[4] Area for displaying the Saved Search Conditions List
[5] Area for displaying the project list

5.1 Emergency News

This is used by AMED to issue urgent news to users, such as in the event of a system failure.

5.2 News of Planned Shutdowns and Other News

This is used by AMED to issue news to users on planned service shutdowns for system maintenance, and other issues.
Content displayed is the same as that for the Emergency News display area described in 9.1 News.

5.3 Metadata Summary Information

This displays aggregate graphs of the number of registered metadata entries, along with the [top 10 registered disease names], [metadata entries by gender], [metadata entries by region], and [metadata entries by age]. Click on an element in the graph to perform a metadata search using the selected element as a search condition.

5.4 Saved Search Conditions List

This displays a list of saved metadata search conditions.
If no saved search conditions are found, a [No relevant data found.] message will appear.
* For more information on saving metadata search conditions, see 6.2.1 Save Search Conditions.

5.5 Project List

This displays a list of research projects that the user is involved with.
The content shown is the same as that described in 7.1 Project List.

6 Metadata

6.1 Search Metadata

Enter search conditions to search for metadata.
The following four search methods are available.
・Searching by entering search keywords.
・Searching again using saved search conditions.
・Searching again using the search condition history.
・Displaying only analysis data not tied to a sample.


When detailed conditions are displayed

6.1.1 Search Target Metadata

Metadata handled by the system include the following five types of data according to the JGA data model.
Item name | Summary | Details
Study | Research information | Data in top-level objects, including study content, research expenses, and publication information. After the data is provided, it is shared publicly to provide an overview of research studies.
Sample | Information per sample | Sample ≒ Individual. Phenotype data (gender, age, etc.), and anonymous donor IDs.
Experiment | Experiment analysis information | Experiment procedures, questionnaires, library information, sequencers, etc. A single sample is linked to multiple data objects.
Data | Genome data information | Stores (raw) data files (fastq, bam, array data) on individuals.
Analysis | Data analysis information | Stores analysis data of multiple data or sample types. Example: Charts summarizing variant data (vcf) and phenotype data.

6.1.2 Keyword Search (Synonyms)

Data that contain a keyword entered in the top box in any search target item are retrieved.
Data that contain a keyword entered in the bottom box in any search target item are excluded from retrieval.
Example: When searching for the keyword “cancer” by entering the word in the top box, data containing the word “cancer” in any search target, such as the title or abstract of a study, will be retrieved.
When entering the keyword “cancer” into the bottom box, data containing the word “cancer” in any search target, such as the title or abstract of a study, will not be retrieved.
* For details on search target items in a keyword search, see 11.1 Search Target Items in a Keyword Search.

If the [Search including synonyms] check box is selected, synonyms of the keyword entered will also be retrieved in a search.
Example: When entering the keyword “Alzheimer’s disease,” data corresponding to synonym keywords such as “Alzheimer’s” and “Alzheimer dementia” will also be retrieved.
If you do not wish to include synonyms in the search results, uncheck the [Search including synonyms] check box before performing your search.
Example: Entering the keyword “Alzheimer’s disease” and unchecking the [Search including synonyms] check box before performing the search will limit search results to data matching the keyword “Alzheimer’s disease,” without retrieving data containing keywords such as “Alzheimer’s” and “Alzheimer dementia.”
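
The behavior of the two keyword boxes and the synonym check box can be sketched in Python as follows. This is a simplified illustration, not the platform's search engine. The synonym table, the function names, case-insensitive substring matching, and applying synonyms only to the top (include) box are all assumptions.

```python
# Hypothetical synonym table; the platform's real dictionary is not
# described in this manual.
SYNONYMS = {
    "alzheimer's disease": ["alzheimer's", "alzheimer dementia"],
}


def expand_keyword(keyword: str, include_synonyms: bool) -> list[str]:
    """Return the keyword plus its synonyms when the check box is selected."""
    key = keyword.lower()
    terms = [key]
    if include_synonyms:
        terms.extend(SYNONYMS.get(key, []))
    return terms


def is_retrieved(text: str, include: str, exclude: str = "",
                 include_synonyms: bool = True) -> bool:
    """Top box (include) must match; bottom box (exclude) must not match."""
    haystack = text.lower()
    if exclude and exclude.lower() in haystack:
        return False
    return any(term in haystack
               for term in expand_keyword(include, include_synonyms))


title = "Cohort study of Alzheimer dementia"
print(is_retrieved(title, "Alzheimer's disease"))                          # True
print(is_retrieved(title, "Alzheimer's disease", include_synonyms=False))  # False
```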


6.1.3 Keyword Suggestions

A list of disease names that partially match the keyword entered in the keyword field will appear as suggested keywords.


6.1.4 Check the Number of Search Results

Click the [CHECK THE NUMBER OF SEARCH RESULTS] button to check the number of metadata entries matching the search conditions entered.

6.1.5 Search Using Saved Search Conditions

You can perform a search using previously saved search conditions.
[Number of Search Results] shows the number of matches retrieved when the search was previously saved. If metadata was added or deleted since the search was performed, the number of matches shown may not match the current number of matches in search results.
Click the [DOWNLOAD THE SEARCH CONDITIONS] button to download the saved search conditions as a file in JSON format.
Click the [DELETE] button to delete saved search conditions.
* For more information on saving search conditions, see 6.2.1 Save Search Conditions

6.1.6 Search Condition History Search

You can perform searches using previously entered search conditions (five most recent search conditions).
Number of Search Results shows the number of matches retrieved when the search was previously performed. If metadata was added or deleted since the search was performed, the number of matches shown may not match the current number of matches in search results.

6.1.7 Searching Analysis Data Not Tied to a Sample

This displays a list of data without a “SAMPLE_REF” entry in Analysis data.

6.2 Metadata Search Results

The Search Metadata screen displays a list of samples tied to extracted data.
Example: When searching for the keyword “cancer,” matching data samples will appear when the keyword appears in the title, abstract, or other part of a study.

6.2.1 Save Search Conditions

Click the [SAVE THE SEARCH CONDITIONS] button to save search conditions that are used repeatedly, such as keywords that are frequently searched for. Up to 10 search conditions can be saved. The saved conditions appear on the Dashboard screen and the Metadata Search screen, and can be used to perform searches with the same search conditions.

6.2.2 Expand Synonyms

When performing a search with the [Search including synonyms] check box selected as described in 6.1.2 Keyword Search (Synonyms), you can check which synonyms were used.
You can also select which synonyms to include and search again.
Example: When entering the keyword “Alzheimer’s disease,” keywords such as “Alzheimer’s” and “Alzheimer dementia” will also be used as synonyms. To exclude “Alzheimer dementia” from the search results retrieved, uncheck the “Alzheimer dementia” check box and perform the search again to only retrieve data for “Alzheimer’s disease” and “Alzheimer’s.”


The screenshot below shows an expanded view.

6.2.3 Display Graph

This displays a graph that aggregates the search results.


You can click on an element in the graph to refine search results.
The search condition will be overwritten when the same condition is clicked in the graph.
(For check boxes, a search will be performed as though all conditions other than the selected condition have not been checked)
Example: When clicking Kanto in the Region graph, Kanto will be the only Region search condition selected, and a search will be performed with all check boxes other than Kanto unchecked.
Click the [Re-display of initial search results] button to redisplay the initial results before search results were refined.
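
The overwrite behavior described above can be sketched in Python as follows. This is an illustration under assumptions: search conditions are modeled as a mapping from a condition name to the set of checked values, and the function name is hypothetical.

```python
def refine_by_graph_click(conditions: dict[str, set[str]],
                          field: str, value: str) -> dict[str, set[str]]:
    """Return new conditions in which only the clicked value stays checked.

    Other condition fields are kept unchanged, and the input is not modified,
    so the initial search conditions can still be re-displayed.
    """
    refined = {name: set(values) for name, values in conditions.items()}
    refined[field] = {value}
    return refined


initial = {"Region": {"Kanto", "Kansai", "Kyushu"}, "Gender": {"Female"}}
refined = refine_by_graph_click(initial, "Region", "Kanto")
print(refined["Region"])  # {'Kanto'}
```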


6.2.4 Download Metadata Search

You can download metadata search results.
Click the [GENERATE DOWNLOAD DATA] button to request a download file of the metadata search results currently displayed. A message confirming that the data generation request was accepted appears on the Metadata Search screen.
* Only one file can be generated at a time. The [GENERATE DOWNLOAD DATA] button will appear inactive and cannot be clicked while a file is being generated (or has already been generated). If you wish to download another set of search results, either cancel the data generation process if it is in progress, or delete the previously generated data on the Metadata Search screen.

Files are generated in the order the system receives generate file requests. Once the file is generated, it can be downloaded from the Metadata Search screen.
* You can confirm the status of download files being generated by clicking the [Click here to view the status] link on the Metadata Search screen.









If the download file generation has not started, you can click the [CANCEL] button to cancel the generate download data process.
* Once the system starts generating a download file, the [CANCEL] button will be hidden from view, and you will not be able to cancel the process.




Once the download file has finished generating, a task complete message will appear. Click the [DOWNLOAD] button to download the file. If the file is no longer needed, click the [DELETE] button to delete the file.
* If you want to generate a new download file, delete the previous file first, and then generate the new file.
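
The rules in this section about which buttons are usable at each stage can be summarized in a short Python sketch. The status names and function names are hypothetical and are used only to illustrate the flow described above.

```python
from enum import Enum
from typing import Optional


class DownloadStatus(Enum):
    """Hypothetical status names for a download file request."""
    WAITING = "waiting"        # request accepted, generation not started
    GENERATING = "generating"  # generation in progress
    COMPLETED = "completed"    # file is ready


def can_generate(status: Optional[DownloadStatus]) -> bool:
    # Only one file at a time: a new request needs no existing request or file.
    return status is None


def can_cancel(status: Optional[DownloadStatus]) -> bool:
    # [CANCEL] is available only before generation starts.
    return status is DownloadStatus.WAITING


def can_download_or_delete(status: Optional[DownloadStatus]) -> bool:
    # [DOWNLOAD] and [DELETE] apply to a finished file.
    return status is DownloadStatus.COMPLETED


print(can_generate(None))                         # True
print(can_cancel(DownloadStatus.GENERATING))      # False
```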


6.3 Show Metadata Details - Sample

The Metadata Search Results screen displays sample data tied to the data selected.
*Sample is a general term used to refer to data containing anonymized information about a subject used for research purposes.

6.4 Show Metadata Details - Experiment

The Metadata Search Results screen displays experiment data tied to the data selected.
*Experiment is a general term used to refer to data containing information on experiment procedures, questionnaires, library information, sequencers, and other experiment information used in research.

6.5 Show Metadata Details - Study

The Metadata Search Results screen displays study data tied to the data selected.
*Study is a general term used to refer to research information, including the content of studies, research expenses, and publication information.

6.6 Show Metadata Details - Data

The Metadata Search Results screen displays Data information tied to the data selected.
*Data is a general term used to refer to data containing raw data information on experiment results for a specific experiment.

6.7 Show Metadata Details - Analysis

The Metadata Search Results screen displays analysis data tied to the data selected.
*Analysis is a general term used to refer to data containing analysis result information, including experiment result analysis and sample information analysis.

6.8 Show Metadata Details - Analysis (No Association)

Here you can check a list of analysis data entries not tied to a particular sample. This screen can be reached by clicking the [Click here for Analysis data not associated with Sample (without SAMPLE_REF registration)] link on the Search Metadata screen.

7 Project Information

In this system, “project” refers to “research projects.”

7.1 Project List

This displays a list of research projects that the user is involved with.
Research project information is registered by the system administrator.

7.2 Project Details

You can confirm detailed information on research projects selected on the Project List screen.
This displays project information, biobanks available for use on a project, and researchers participating in a project.

8 Pre-Research

8.1 Data Management

You can upload files for analysis in the analysis environment, and download analysis results.
[1] Analysis Environment Upload... Select the Docker image file of the analysis environment you wish to register, and then click the [UPLOAD] button.
[2] Uploaded Analysis Environment List... This displays a list of Docker image files of registered analysis environments. Any file that is no longer needed can be deleted by clicking the [Delete] button.
[3] Data File Upload... Select the data file you wish to register, and then click the [UPLOAD] button.
[4] Data File List... This displays a list of registered data files. Click the [Download] button to download the file. Any file that is no longer needed can be deleted by clicking the [Delete] button.
[5] Pre-Research Results File Copy... This copies the results file for a sample set registered on the Sample Set Registration screen. Select the check box next to files you want to copy, and then click the [COPY] button.
[6] Pre-Research Preset File Download... This downloads sample set files registered as system presets. Read the Terms of Use, and then click the [Download] button.

8.1.1 Analysis Environment Upload

To run user-prepared Docker image files on the analysis environment instead of the template Docker image files provided with the AMED Data Utilization Platform, you can upload up to five image files. Uploaded image files will be available for selection when registering the analysis environment.
Uploaded image files will appear in the Uploaded Analysis Environment List. Click the [Delete] icon to delete any unnecessary files.
・ Uploaded image files are deleted automatically if they are not used for a 90-day period.
・ User-created Docker image files may not work properly if they do not follow the rules for creating Docker image files set for the AMED Data Utilization Platform. For more information on the rules on creating image files, see 11.2 Procedure for Using User-created Docker Image Files.
The following restrictions apply when uploading files.
Item name | Restriction
File size | 2 GB or less
File extensions | Only .tar and .tar.gz
File names | File name must only contain the following characters
・ Half-width alphanumeric characters [0-9][a-z][A-Z]
・ Half-width exclamation marks [!]
・ Half-width hyphens [-]
・ Half-width underscores [_]
・ Half-width periods [.] (Half-width periods cannot be used at the start of the file name)
・ Half-width at signs [@]
・ Half-width equal signs [=]
・ Half-width commas [,]
・ Half-width carets [^]
・ Half-width sharp marks [#]
・ Half-width round parentheses [()]
・ Half-width curly brackets [{}]
・ Half-width square brackets [[]]
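
To check a Docker image file before uploading, the restrictions in 8.1.1 can be written as a small Python sketch. This is not the platform's own validation. The function names are hypothetical, "2 GB" is read as 2 GiB, and extensions are assumed to be lowercase.

```python
import re

# Characters allowed by the file name restriction in this section.
ALLOWED_NAME = re.compile(r"[0-9A-Za-z!\-_.@=,^#(){}\[\]]+")

# Assumption: "2 GB" is interpreted as 2 GiB.
MAX_IMAGE_BYTES = 2 * 1024**3


def is_allowed_file_name(name: str) -> bool:
    """Allowed characters only, and no period at the start of the name."""
    return not name.startswith(".") and ALLOWED_NAME.fullmatch(name) is not None


def is_acceptable_image(name: str, size_bytes: int) -> bool:
    """Apply the name, extension, and size restrictions together."""
    return (
        is_allowed_file_name(name)
        and name.endswith((".tar", ".tar.gz"))
        and size_bytes <= MAX_IMAGE_BYTES
    )


print(is_acceptable_image("my-analysis_v1.tar.gz", 500 * 1024**2))  # True
print(is_acceptable_image("my analysis.tar", 500 * 1024**2))        # False (space)
```

The same file name check applies to data files in 8.1.2, which list the same allowed characters.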

8.1.2 Data File Upload

You can upload files for analysis in the analysis environment, up to a total of 10 GB.
* The disk space available to the user is 10 GB. Note that files generated as analysis results are also counted towards this total. The larger the generated files are, the less space is available for uploading files.
The following restrictions apply when uploading files.
Item name | Restriction
File size | 2 GB or less
Overall file size | The total disk space available is 10 GB or less
File names | File name must only contain the following characters
・ Half-width alphanumeric characters [0-9][a-z][A-Z]
・ Half-width exclamation marks [!]
・ Half-width hyphens [-]
・ Half-width underscores [_]
・ Half-width periods [.] (Half-width periods cannot be used at the start of the file name)
・ Half-width at signs [@]
・ Half-width equal signs [=]
・ Half-width commas [,]
・ Half-width carets [^]
・ Half-width sharp marks [#]
・ Half-width round parentheses [()]
・ Half-width curly brackets [{}]
・ Half-width square brackets [[]]
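
The size limits in 8.1.2 combine a per-file limit with a shared disk quota. A minimal Python sketch of that check is shown below. The function name is hypothetical, and "GB" is read as GiB, which is an assumption.

```python
GIB = 1024**3  # assumption: "GB" in this manual is read as GiB

MAX_FILE_BYTES = 2 * GIB       # per-file limit
TOTAL_QUOTA_BYTES = 10 * GIB   # disk space available to the user


def can_upload(file_bytes: int, used_bytes: int) -> bool:
    """used_bytes counts uploaded files and files generated as analysis results."""
    within_file_limit = file_bytes <= MAX_FILE_BYTES
    within_quota = used_bytes + file_bytes <= TOTAL_QUOTA_BYTES
    return within_file_limit and within_quota


print(can_upload(1 * GIB, 8 * GIB))  # True
print(can_upload(2 * GIB, 9 * GIB))  # False (quota exceeded)
```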

Uploaded files and files generated as analysis results will appear in the Data File List.
Click the [DOWNLOAD] icon to download files. Any file that is no longer needed can be deleted by clicking the [Delete] icon.
・ Data files are deleted automatically if they are not used for a 90-day period.

8.1.3 Pre-Research Results File Copy

You can use on-demand VCF files generated in sample set registration for analysis by registering these files as data files. On-demand VCF files generated appear in the Pre-Research Results File Copy field. Select the check box next to the file you wish to register, and then click the [COPY] icon. The copied file appears in the Data File List field. This file can now be used as a data file.
To download an on-demand VCF file, copy the file in the same way, and then download it from the Data File List field.
To confirm details of generated on-demand VCF files, click the [List] button to jump to the Variant List screen for the corresponding data. Click on the [Confirm] button to check the conditions for sample set registration.

8.1.4 Download Pre-Research Preset File

You can download the registered overall.vcf file as a preset file. Please read through the Terms of Use when downloading.

8.2 Sample Set Registration

To perform pre-research on a supercomputer, perform sample set registration with specific extraction conditions.
Fill in the fields marked [1] to [4] in the screenshot below and confirm the details provided in [5]. Next, click the [ADD] button to display the Register Sample Set Name dialog box.
Register a sample set name of your choosing in the dialog box to send a processing request to the supercomputer.
* Only one sample set can be registered at a time. You cannot register a new sample set while another one is being processed. A maximum of 10 sample sets can be registered. To register additional sample sets beyond this, delete a sample set that is no longer required.
For details on how to confirm the status of supercomputer processes, and how to view the extraction results, see 8.3 Sample Set List.

[1] Select the biobank subject to data extraction. A biobank must be specified.
A processing results file is generated for each biobank. Specifying three biobanks will generate three files.
[2] Specify the disease names to be included and the disease names to be excluded in the data extracted.
For details on how to specify diseases, see 8.2.1 Disease Setting
[3] Set the attributes for data extraction. At least one disease or attribute must be specified as an extraction condition.
[4] Set the area for data extraction. The area setting is required.
For details on how to specify an area, see 8.2.2 Area Setting
[5] When specifying extraction conditions, the number of matching metadata items appears at the bottom of the screen. Note that sample set registration cannot be performed if the number of samples for the biobank specified is less than 10. Modify the condition settings so that the number of samples is 10 or more.
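
The conditions in [1] to [5] can be summarized as a pre-check, sketched below in Python. This is a simplified illustration, not the platform's logic. The class, field, and function names are hypothetical, and the per-biobank sample counts are assumed to be supplied by the caller.

```python
from dataclasses import dataclass, field

MIN_SAMPLES_PER_BIOBANK = 10  # from item [5]


@dataclass
class ExtractionRequest:
    """Hypothetical container for the conditions entered in [1] to [4]."""
    biobanks: list[str]
    area: str
    diseases: list[str] = field(default_factory=list)
    attributes: dict[str, str] = field(default_factory=dict)


def validation_errors(request: ExtractionRequest,
                      sample_counts: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means registration can proceed."""
    errors = []
    if not request.biobanks:
        errors.append("select at least one biobank")
    if not request.diseases and not request.attributes:
        errors.append("specify at least one disease or attribute")
    if not request.area.strip():
        errors.append("the area setting is required")
    for biobank in request.biobanks:
        if sample_counts.get(biobank, 0) < MIN_SAMPLES_PER_BIOBANK:
            errors.append(f"{biobank}: fewer than 10 matching samples")
    return errors


request = ExtractionRequest(biobanks=["Biobank A"], area="HIRA", diseases=["asthma"])
print(validation_errors(request, {"Biobank A": 25}))  # []
```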

8.2.1 Disease Setting

Search and specify diseases either by [SEARCH BY KEYWORDS] or [SEARCH BY DISEASE CLASSIFICATION].
[1] Search by Keywords
Use the same synonym search method described in 6.1.2 Keyword Search (Synonyms) to search for disease names and specify diseases.

[2] Search by Disease Classification
Select the category of disease you wish to specify from those available, and then specify the disease.

8.2.2 Area Setting

Specify the area to be extracted as a Gene, a refSNP, a chrom and position, or a chrom and position range.
[1] Gene: Specify HIRA, etc.
[2] refSNP: Specify rs2283354, etc.
[3] chrom, position: Specify as the chrom value + “:” + the position value, such as 22:111770161, etc.
[4] chrom, position area setting: Specify as the chrom value + “:” + the position value (from) + “-” + the position value (to), such as 10:71510986-71617219, etc.
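
The four formats above can be told apart mechanically. The following Python sketch classifies an input string. It is only an illustration: the function name is hypothetical, chromosome names are assumed to consist of ASCII letters and digits, and anything that matches no other format is treated as a gene name. It does not check whether the value exists in the preset file.

```python
import re

# Patterns follow the formats [2] to [4] listed above.
RANGE_PATTERN = re.compile(r"([0-9A-Za-z]+):([0-9]+)-([0-9]+)")
POSITION_PATTERN = re.compile(r"([0-9A-Za-z]+):([0-9]+)")
REFSNP_PATTERN = re.compile(r"rs[0-9]+")


def classify_area(value: str) -> tuple:
    """Return a tuple whose first element names the detected format."""
    text = value.strip()
    match = RANGE_PATTERN.fullmatch(text)
    if match:
        return ("range", match.group(1), int(match.group(2)), int(match.group(3)))
    match = POSITION_PATTERN.fullmatch(text)
    if match:
        return ("position", match.group(1), int(match.group(2)))
    if REFSNP_PATTERN.fullmatch(text):
        return ("refSNP", text)
    return ("gene", text)


print(classify_area("10:71510986-71617219"))  # ('range', '10', 71510986, 71617219)
print(classify_area("rs2283354"))             # ('refSNP', 'rs2283354')
```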

Only values contained in VCF files registered as preset files can be specified.
* On-demand VCF files are created from data extracted from the preset file. Target data for extraction will not be found if you specify a value that does not exist in the preset file, and sample set registration cannot be performed.

8.3 Sample Set List

This shows a list of registered sample sets.
Use this to confirm the execution status of sample sets registered on the Sample Set Registration screen, and details of sample sets that have been executed by jumping to the Variant List screen.
You can also delete files and sample sets that are no longer needed.






8.3.1 View Sample Sets

The list contains sample sets that come pre-registered as presets, and sample sets registered by the user.
・ Presets appear with the name “Basic Sample Set for Data Utilization Platform.” Due to the large volume of data, the Variant List screen can take over a minute to display.
・ Files are automatically deleted 30 days after they are generated. Following this, they will no longer appear on the Variant List screen.

8.3.2 Confirm Execution Status

The Status column in the list updates according to the processing status on the supercomputer. When the status changes to “Registered,” the Variant List screen can be displayed.

8.3.3 Using Sample Sets

You can confirm the extraction conditions at the time of registration, and delete any unnecessary data using [File Delete] or [Sample Set Delete].
・ [Confirm]: This displays the extraction conditions. Click the [Register new sample set] button on the View Conditions screen to jump to the Sample Set Registration screen with the same conditions already set, so that you can register the sample set again.
・ [File Delete]: This deletes the generated files. The extraction conditions and other sample set data can still be viewed, but you will no longer be able to jump to the Variant List screen.
・ [Sample Set Delete]: This deletes all sample set data and removes the sample set from the list.

8.4 Variant List

This shows a list of variants for the sample set selected on the Sample Set List screen.
・ If a variant does not exist in a biobank, the Alt allele frequency will display "N/A".
・ If a variant exists in a biobank but the number of samples is below the minimum required to show frequency information, the Alt allele frequency will display "N/A(※)".

Enter keywords in the following format to filter by data type.
[1] Enter a value in xx:yyyyyyyyy format -> Filter by chrom and position.
[2] Enter a value in xx:yyyyyyyyy-zzzzzzzzz format -> Filter by chrom and position range.
[3] Other values -> Filter by Gene, or refSNP.


8.5 Variant Details

This shows details of variants selected on the Variant List screen.

For a basic sample set, click the Disease, Gender, Age, Region buttons under Frequency at the bottom of the screen to show/hide each type of information for each biobank.


When Disease results are shown, you can filter results by a partial match of the disease name.

8.6 Analysis Environment Registration

Specify a Docker image and start the analysis environment.
Enter the fields [1] through [6] described below, and then click the [START] button to start the Docker image in the analysis environment.
* Up to 4 CPUs and 28 GB of memory are available in total at the same time. When running multiple analysis environments at the same time, take care not to exceed these limits.
For information on how to view the startup status of the analysis environment and how to access the Notebook while it is running, see 8.7 Analysis Environment List.

[1] Enter the name of the analysis environment.
[2] Specify a template image or a user-created image as the Docker Execution Image.
・ Template: Image used to start Jupyter Notebook. You can access the started Notebook from your browser.
・ User-created: A user-created image registered on the Data Management screen (see 8.1.1 Analysis Environment Upload)
[3] When using a user-created image, enter the user name used in the user-created image.
For the Container User Name, enter a user name other than “root,” “sys,” or “adm.”
[4] Select the CPUs (1 to 4) used in the analysis environment.
[5] Select the amount of memory (1 GB to 28 GB) used in the analysis environment.
[6] Select the Auto Stop Time (10 minutes to 300 minutes) for the analysis environment.
The analysis environment will stop automatically, even if it is in the middle of processing tasks, once this amount of time passes from when the analysis environment is started.
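
The ranges for fields [3] through [6] can be checked before the values are entered on the screen. The following is a minimal sketch in POSIX shell, assuming a user-created image (so that a Container User Name is required); the function name validate_env_settings, its argument order, and its messages are hypothetical and are not part of the platform.

```shell
#!/bin/sh
# validate_env_settings USER CPUS MEMORY_GB AUTO_STOP_MIN
# Prints "ok" and returns 0 when every value is inside the ranges listed in 8.6;
# otherwise prints the first problem found and returns 1.
validate_env_settings() {
    user=$1
    cpus=$2
    mem_gb=$3
    stop_min=$4

    # [3] The Container User Name must not be root, sys, or adm.
    case "$user" in
        ""|root|sys|adm)
            echo "invalid Container User Name: '$user'"
            return 1 ;;
    esac

    # Reject non-numeric input before the numeric comparisons below.
    for n in "$cpus" "$mem_gb" "$stop_min"; do
        case "$n" in
            ""|*[!0-9]*)
                echo "not a whole number: '$n'"
                return 1 ;;
        esac
    done

    # [4] CPUs: 1 to 4
    if [ "$cpus" -lt 1 ] || [ "$cpus" -gt 4 ]; then
        echo "CPUs must be 1 to 4"
        return 1
    fi
    # [5] Memory: 1 GB to 28 GB
    if [ "$mem_gb" -lt 1 ] || [ "$mem_gb" -gt 28 ]; then
        echo "memory must be 1 GB to 28 GB"
        return 1
    fi
    # [6] Auto Stop Time: 10 to 300 minutes
    if [ "$stop_min" -lt 10 ] || [ "$stop_min" -gt 300 ]; then
        echo "Auto Stop Time must be 10 to 300 minutes"
        return 1
    fi
    echo "ok"
}

validate_env_settings analyst 2 8 60
```

The sketch checks a single analysis environment only. It does not add up the CPUs and memory of other environments that are running at the same time.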

8.7 Analysis Environment List

This displays a list of analysis environments registered on the Analysis Environment Registration screen.
Use this to view the startup status of analysis environments, access the Jupyter Notebook, and restart analysis environments that have finished running.
You can also delete analysis environments that are no longer needed.

8.7.1 View Analysis Environments

This displays a list of registered analysis environments.
You can download the analysis result file from the Data Management screen (see 8.1 Data Management).
・ The analysis environment and the data files saved as analysis results are deleted automatically if they are not accessed for 90 days.

8.7.2 Using Analysis Environments

・ When a template image is started, the [Access Notebook] button will appear. Click this button to open the running Jupyter Notebook in another tab and work on the analysis.
・ If the analysis environment is running, the [STOP] button will appear. Click this button to stop the analysis environment.
・ If the analysis environment is stopped, the [START] button will appear. Click this button to restart the analysis environment.
・ If the analysis environment is stopped, the [Delete] button will appear. Click this button to delete the analysis environment.
・ In the Container Log column, click the [Confirm] button to view the container log from the execution of the analysis environment.
 Logs can only be viewed when the analysis environment is stopped.

8.7.3 Notes when using Jupyter Notebook

For notes on using Jupyter Notebook, see 11.3 Procedure for Using Template Docker Image Files.

9 News & Help

9.1 News

Use this to check important news, and other news.

9.2 Help & Contact

This displays manuals and contact information.
Click each button to view the corresponding manual in a separate tab.

10 Application for Data Utilization

Click the [Application for Data Utilization] link in the menu to open the electronic review system in another tab.


11 Supplement

11.1 Search Target Items in a Keyword Search

Items subject to keyword searches on the Search Metadata screen are summarized below.
Category     Item
Study        Title
             Abstract
             Attributes/Value
             Grants/Title
             Grants/Agency
             Grants/Abbr
             Publications/Reference
             Study Type/Study Type
Experiment   Title
             Design Description
Sample       Title
             Sample Group Type
             Description
             Disease Name
             icd_code
             Attributes/Value
Analysis     Title
             Description
             Attributes/Value

11.2 Procedure for Using User-created Docker Image Files

This chapter describes the procedure for creating Docker images. It covers building a Docker image in the user’s environment by creating a Dockerfile that describes the build procedure and a script (entrypoint.sh) that is executed when the Docker container starts. To create the Docker image described in this Manual, Docker must be installed in the user’s environment. Note that setting up Docker in the user’s environment falls outside the scope of this Manual.

While the AMED Data Utilization Platform allows users to upload a Docker image and run the Docker container as an analysis environment, there are restrictions on external connections to and from the running Docker container. As such, users cannot log in directly to a container started from a user-created Docker image. Analysis processes and any other processing must therefore be defined when the Docker image is created.

11.2.1 Creating the Docker image

[1] Creating the Dockerfile
The Dockerfile is used to specify the base image, install the required packages, set environment variables, and describe the files to add and the commands to execute, etc.
The Dockerfile primarily uses the following commands. Take note of the following when creating the Dockerfile.

FROM
Specifies the base image for the Docker image. As such, the Dockerfile must start with the FROM command.
The AMED Data Utilization Platform only runs Linux-based Docker images. Windows-based Docker images cannot be used.
 [Format]
 FROM Image Name [:Tag] 

USER
Sets the user executing subsequent RUN and ENTRYPOINT commands.
The AMED Data Utilization Platform does not allow the use of root, sys, or adm in the Docker container. As such, root, sys, and adm cannot be specified as the ENTRYPOINT or CMD user when performing analysis.
 [Format]
 USER User[:Group]  

COPY
Adds files and directories to the Docker image by copying them from the copy source to the copy destination on the Docker image. The copy source must be located in the build context (the set of files and directories used to build the Docker image; in this Manual, this is the current directory in which the build command is executed).
 [Format]
 COPY Copy source Copy destination on the Docker image

RUN
This describes package installation and configuration commands executed when building the Docker image.
 [Format]
 RUN command

The uploaded Docker image automatically executes analysis processes using scripts, etc. in the Docker container and outputs the analysis results to an accessible storage area. Users can retrieve the analysis results file via the AMED Data Utilization Platform application.

Install Mountpoint for Amazon S3 using RUN commands in the Dockerfile so that the Docker container can access the storage area and upload files. The procedure for installing Mountpoint for Amazon S3 varies depending on the Linux distribution in use. For more information, see the Amazon Simple Storage Service (S3) User Guide.

Example RUN command for installing Mountpoint for Amazon S3 (using Ubuntu)
 # Install mount-s3
 USER root
 RUN apt-get update
 RUN apt-get install -y wget
 RUN wget https://s3.amazonaws.com/mountpoint-s3-release/latest/x86_64/mount-s3.deb 
 RUN apt-get install -y ./mount-s3.deb
 RUN rm -rf ./mount-s3.deb

ENTRYPOINT
Specify the command or script to execute when starting the Docker container.
Only one ENTRYPOINT command can be added to the Dockerfile. To run analysis processes automatically, use the ENTRYPOINT command, or the CMD command. The procedure used to specify an analysis process in the ENTRYPOINT command is described in “[2] Creating a script to be executed when starting the Docker (entrypoint.sh)” in this Manual.
 [Format]
 ENTRYPOINT command
 Or
 ENTRYPOINT ["Executable", "Parameter 1", "Parameter 2"]

Example of specifying the script (entrypoint.sh) in the ENTRYPOINT command
 ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]

CMD
Executes the specified command when the Docker container starts. Note that if ENTRYPOINT is also described in the Dockerfile, the CMD value will be interpreted as arguments to ENTRYPOINT.
 [Format]
 CMD command
 Or
 CMD ["Executable", "Parameter 1", "Parameter 2"]

The following example shows the entire Dockerfile.
 FROM ubuntu

 # Install mount-s3
 USER root
 RUN apt-get update
 RUN apt-get install -y wget
 RUN wget https://s3.amazonaws.com/mountpoint-s3-release/latest/x86_64/mount-s3.deb  
 RUN apt-get install -y ./mount-s3.deb
 RUN rm -rf ./mount-s3.deb

 COPY entrypoint.sh /usr/local/bin/entrypoint.sh
 RUN chmod +x /usr/local/bin/entrypoint.sh

 # Run the analysis script when the container starts
 ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
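
Two of the rules stated in this section (the Dockerfile starts with FROM, and only one ENTRYPOINT command is added) can be checked on the user’s side before building. The following is a minimal sketch in POSIX shell; the function name check_dockerfile and its messages are hypothetical, and the check is a simple text scan rather than a full Dockerfile parser. It is not a check performed by the platform.

```shell
#!/bin/sh
# check_dockerfile FILE
# Checks two rules from this section with a plain text scan:
#   1. the first command (ignoring blank lines and comments) is FROM
#   2. at most one ENTRYPOINT command is present
# Prints "ok" and returns 0 when both hold; otherwise prints the problem and returns 1.
check_dockerfile() {
    file=$1

    # First non-blank, non-comment line; keep only its upper-cased first word.
    first=$(grep -Ev '^[[:space:]]*(#|$)' "$file" | head -n 1 | awk '{ print toupper($1) }')
    if [ "$first" != "FROM" ]; then
        echo "first command is '$first', expected FROM"
        return 1
    fi

    # Count lines that begin with ENTRYPOINT (grep -c prints 0 when none match).
    count=$(grep -Eic '^[[:space:]]*ENTRYPOINT([[:space:]]|$)' "$file" || true)
    if [ "$count" -gt 1 ]; then
        echo "found $count ENTRYPOINT commands, expected at most 1"
        return 1
    fi
    echo "ok"
}

# Usage (run in the directory that contains the Dockerfile):
#   check_dockerfile ./Dockerfile
```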



[2] Creating a script to be executed when starting the Docker (entrypoint.sh)
In this chapter, entrypoint.sh is the script specified in the ENTRYPOINT command in the Dockerfile, and it describes the analysis process executed in the Docker container.

・ How to use data files for analysis
In the AMED Data Utilization Platform, the storage area is mounted after the Docker container starts up.
Uploaded data cannot be used or saved until this mounting process is completed.
To prevent the analysis from starting before the mounting process is complete, make sure to set a standby time (60 seconds) before the analysis process begins.

Specify standby time for mounting process (in entrypoint.sh)
 # Waiting for mount process 
 sleep 60s  # provisional value

When the mounting process is complete, the data file uploaded from the Data Management screen in the AMED Data Utilization Platform is placed in the following storage location in the Docker container. (*)

Data file storage location
 /home/[User (set when starting the analysis environment)]/s3/[Name of the uploaded data file]

Updating and saving data files directly on the mounted user storage area (/home/[User (set when starting the analysis environment)]/s3/) has not been tested, so make sure to copy data files into the Docker container before use.

Example command for moving data files to the container during use
 cp /home/[User (set when starting the analysis environment)]/s3/[Name of the data file in use] [Container directory]


You can retrieve and save files output in the analysis process by storing them in the user storage area (/home/[User (set when starting the analysis environment)]/s3/). (*)
If the analysis environment is stopped (the Docker container is stopped) without storing output files in the user storage area, the data will be lost and cannot be retrieved when starting the analysis environment up again. Set a command in the analysis process script to store files you wish to save in the user storage area.

Example of command for storing data files
 cp [Container directory]/[Save file name] /home/[User (set when starting the analysis environment)]/s3/

* For more information on the procedure for uploading and downloading data from the AMED Data Utilization Platform, see 8.1 Data Management.

・Preventing the Docker container from being restarted unintentionally
When running an analysis process, the Docker container will stop when the main process is finished.
If the Docker container stops on its own rather than through a stop command issued from the AMED Data Utilization Platform application, the Docker container may be restarted automatically to keep it running.
If this happens, the results of the analysis process performed may be overwritten. To prevent the container from restarting unintentionally, write a command in entrypoint.sh or the Dockerfile to prevent the Docker container process from finishing after the analysis process. Make sure to terminate the Docker container from the application.
An example of a command specified in entrypoint.sh to prevent the Docker container from restarting after the analysis process finishes is provided below.
For more information on the procedure for terminating the Docker container, see 8.7 Analysis Environment List.

・Example of a command added to entrypoint.sh to prevent the Docker container from restarting
 # End analysis process

 # Execute commands to hold processes 
 tail -F /dev/null 

After the container is started, you can check the standard output and error output of processes executed in the container in the container log. View the container log to check the status of container operations. For more information on the procedure for viewing the container log, see 8.7 Analysis Environment List.
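
The steps described in [2] (waiting for the mount, copying data files into the container, storing results, and holding the process) can be combined into one script layout. The following is a minimal sketch and not a definitive entrypoint.sh: the variable names S3_DIR, WORK_DIR, and MOUNT_WAIT, the function names, the user name analyst, and the file names input.vcf and result.txt are all hypothetical and must be replaced to match your own image and Container User Name.

```shell
#!/bin/sh
# Sketch of an entrypoint.sh layout. All names below are hypothetical.
S3_DIR=${S3_DIR:-/home/analyst/s3}    # mounted user storage area (replace "analyst")
WORK_DIR=${WORK_DIR:-/tmp/work}       # container-local working directory
MOUNT_WAIT=${MOUNT_WAIT:-60}          # standby time in seconds (provisional value)

# Copy one uploaded data file from the storage area into the container.
stage_in() {
    mkdir -p "$WORK_DIR" && cp "$S3_DIR/$1" "$WORK_DIR/"
}

# Copy one result file from the container back to the storage area.
stage_out() {
    cp "$WORK_DIR/$1" "$S3_DIR/"
}

# Full flow: wait for the mount, stage in, analyze, stage out, then hold.
run_all() {
    sleep "$MOUNT_WAIT"
    stage_in "$1"
    # ... analysis commands that read "$WORK_DIR/$1" and write "$WORK_DIR/$2" ...
    stage_out "$2"
    # Keep the main process alive so that the container is not restarted.
    tail -F /dev/null
}

# In a real entrypoint.sh the last line would be a call such as:
#   run_all input.vcf result.txt
```

Because run_all never returns, the container keeps running until it is stopped from the application, as described above.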

11.2.2 Uploading and downloading data for analysis

This chapter describes how to upload and download data for analysis when using an uploaded Docker image.

[1] Uploading data for analysis
When the analysis environment is started from a Docker image uploaded by the user, external connections to and from the Docker container are restricted. As such, users cannot directly access the running Docker container. The data files to be used and output must therefore be specified in scripts, etc. when the Docker image is prepared in advance.
Upload data files to use from the Data Management screen.

The data file uploaded from the Data Management screen in the AMED Data Utilization Platform is placed in the following storage location, as viewed from the Docker container.

・Data file storage location
 /home/[User (set when starting the analysis environment)]/s3/[Name of the uploaded data file]

[2] Downloading data for analysis
You can retrieve and save files output in the analysis process by storing them in user storage (/home/[User (set when starting the analysis environment)]/s3/). If the analysis environment is stopped without storing output files in user storage, the data will be lost and cannot be retrieved when starting the analysis environment up again. In the scripts that run automatically, set commands in advance that store the files you want to save in user storage.

・Example of command for storing data files
 cp [Container directory]/[Save file name] /home/[User (set when starting the analysis environment)]/s3/

Files saved in the analysis environment will appear in the Data File List on the Data Management screen.

11.3 Procedure for Using Template Docker Image Files

Use a template Docker image to display and use the Jupyter Notebook screen in the analysis environment started in 8.6 Analysis Environment Registration.
On the Analysis Environment List screen, click the [Access Notebook] button for the analysis environment to open it in the web browser.




11.3.1 Uploading and downloading data for analysis

This chapter describes how to upload and download data for analysis when using a template image for analysis that comes prepared with the AMED Data Utilization Platform.

[1] Uploading data for use



[2] Accessing running analysis environments



[3] Displaying the Jupyter Notebook screen


[4] Confirming upload files




[5] Moving upload files
Data updates and saves are not guaranteed to work in the s3 folder. Make sure to move files to the container for use.





[6] Saving analysis results
To retrieve analysis results output on the container, store the results in the s3 folder to allow the results to be downloaded from the AMED Data Utilization Platform application.
The total size of files that can be saved to the s3 folder is 10 GB. When exceeding 10 GB, new files cannot be stored unless existing files are deleted.





Files saved in the analysis environment will appear in the Data File List on the Data Management screen.
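
Before copying a large result file into the s3 folder, it can help to estimate whether the 10 GB total would be exceeded. The following is a minimal sketch in POSIX shell; the function name fits_in_s3 is hypothetical, the default limit treats 10 GB as 10 × 1024 × 1024 KB (an assumption), and the sizes come from du, which reports disk usage and may not match exactly how the platform counts the total.

```shell
#!/bin/sh
# fits_in_s3 DIR FILE [LIMIT_KB]
# Returns 0 when FILE can be added to DIR without the total exceeding LIMIT_KB,
# and 1 otherwise. Sizes are measured in KB with du -sk.
fits_in_s3() {
    dir=$1
    file=$2
    limit_kb=${3:-10485760}    # 10 GB expressed in KB (assumed binary units)

    used_kb=$(du -sk "$dir" | awk '{ print $1 }')
    file_kb=$(du -sk "$file" | awk '{ print $1 }')

    [ $((used_kb + file_kb)) -le "$limit_kb" ]
}

# Usage (paths are placeholders):
#   if fits_in_s3 ~/s3 ./result.txt; then cp ./result.txt ~/s3/; fi
```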


11.3.2 Stopping an analysis environment

[1] Confirm running analysis environments
From the top menu of the AMED Data Utilization Platform, select [Pre-Research], and then [Analysis Environment List].


[2] Stopping an analysis environment
Select [STOP] for the running analysis environment that is no longer in use.
Stopping the container will delete all data for analysis that has not been stored. If you have data that you need to save, see 11.3.1 Uploading and downloading data for analysis.



12 FAQ

No. 1
Question: What if I forget my passcode number?
Answer: Please reset the passcode number, and then register a new passcode number. Ask your system administrator to reset your passcode number.

No. 2
Question: How do I change the email address that one-time passwords are sent to?
Answer: As this requires updates to the user information, ask your system administrator to change the email address.

No. 3
Question: What do I do if the supercomputer connection link does not appear on the screen?
Answer: Users can connect to the supercomputer only when connecting through the collaborative computer systems. Check whether you are using the URL for connecting via the collaborative computer systems.

No. 4
Question: What do I do if old search condition histories do not appear on the Search Metadata screen?
Answer: The search condition history only shows the five most recent entries. Older search histories will not appear, so you will need to re-enter the search conditions to perform the search again.

No. 5
Question: What if the number of research projects I participate in increases?
Answer: As this requires updates to the user information, ask your system administrator to add research projects.