CURATOR HELP CENTER
Step 3: Uploading Metadata
The instructions for uploading metadata to your dataset are listed below.
Uploading metadata
If you upload an anonymised metadata file, users browsing your dataset can run a query and see a preview of the data you are offering. This gives them a better understanding of the dataset. Note that this step is optional.
Item count
Enter the number of data items as well, so that collaborators know the size of the available dataset.
Metaset file
A metaset file contains descriptive information about the data items that can be used to query subsets of the real dataset. You can upload metadata as a file in any structured data format, such as CSV, JSON or Excel.

CSV and JSON are the formats that Curator supports directly; with them you can build a query and publish the dataset immediately. If the CSV or JSON file is faulty, or the data is not available in these formats, any other file format such as .xls, .xlsx or .xml is acceptable. In that case the Longenesis team will provide assistance with structuring and uploading the metaset file.
CSV format
If the CSV format has been selected for the metaset file, there are a couple of preconditions to consider regarding its structure.

1. Each column included in the CSV file will generate a separate parameter in the Query UI builder.

2. If there are multiple values in a single column, they must be separated by commas in order to be recognized as separate items.

3. Dot-decimal notation must be used to separate the integer part from the fractional part of numbers written in decimal form.
For example, write decimal numerals as 12.34 or 75.6, using "." as the separator.

4. ISO date format yyyy-mm-dd or yyyy-mm-dd hh:mm:ss is the supported date and time format.
ISO 8601 represents date and time by starting with the year, followed by the month, the day, the hour, the minutes, seconds and milliseconds.
For example, 2020-07-10 15:00:00.000 represents the 10th of July 2020 at 3 p.m.

5. In the Query UI builder section, the filter type "Range" is automatically recognized if a column is numeric and contains at least three unique values.
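Before uploading, it can help to check a CSV file against the preconditions above. The following Python sketch is only an illustration and is not part of Curator: the function names and the labels "numeric", "date", "multi-value" and "text" are hypothetical, and skipping empty cells is an assumption. Only the "Range" rule (numeric column, at least three unique values) and the two date formats come from the list above.

```python
import csv
import io
import re
from datetime import datetime

# Dot-decimal numerals only: 12.34 passes, 12,34 does not.
NUMBER_RE = re.compile(r"-?\d+(\.\d+)?")
# The two date formats named in precondition 4.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")


def is_number(value):
    return NUMBER_RE.fullmatch(value.strip()) is not None


def is_iso_date(value):
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return True
        except ValueError:
            pass
    return False


def describe_columns(csv_text):
    """Return {column name: hypothetical label} for CSV content given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            if row[name]:  # assumption: empty cells are ignored
                columns[name].append(row[name])

    summary = {}
    for name, values in columns.items():
        if values and all(is_number(v) for v in values):
            unique = {float(v) for v in values}
            summary[name] = "range" if len(unique) >= 3 else "numeric"
        elif values and all(is_iso_date(v) for v in values):
            summary[name] = "date"
        elif any("," in v for v in values):
            summary[name] = "multi-value"
        else:
            summary[name] = "text"
    return summary


sample = (
    "age,height,visit_date,conditions\n"
    '49,176.5,2020-07-10,"Diabetes,Asthma"\n'
    "51,180.0,2020-07-11 15:00:00,Hypertension\n"
    "38,176.5,2020-08-01,\n"
)
print(describe_columns(sample))
```

In this sample, "age" has three unique numeric values and would qualify for the "Range" filter, while "height" has only two unique values and would not. A cell with several comma-separated values has to be enclosed in double quotes so that it stays in one column.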
CSV format variations
There are several variants of the CSV format, which differ in character encoding and in the characters used to separate rows. These variants exist so that data can be exchanged with programs that expect slightly different conventions, ensuring compatibility and successful data import and export. The CSV variants supported by Excel are described below.

CSV UTF-8 (Comma delimited)* uses the Unicode Transformation Format, 8-bit (UTF-8) character encoding. This format supports accented characters and non-English alphabet characters, e.g., ones with diacritic marks.
*This format is recommended so that no information is lost.

CSV (Comma delimited) is the default CSV format. It uses commas as field separators and double quotes as text delimiters, and it does not support accented and non-English alphabet characters, e.g., ones with diacritic marks. Individual rows are separated by the CR/LF character pair (CR stands for Carriage Return, LF stands for Line Feed).

CSV (Macintosh) uses the CR character alone as the new-line character that separates individual rows.

CSV (MS-DOS) uses the CR/LF character pair as the new-line character that separates individual rows.
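If a file was saved in one of the other variants, it can be re-saved as CSV UTF-8 in Excel or converted with a short script. The Python sketch below is a minimal example under stated assumptions: the function name to_utf8_csv is hypothetical, the legacy file is assumed to use the Windows-1252 code page (adjust source_encoding if yours differs), and converting all row endings to LF is an optional convenience rather than a documented Curator requirement.

```python
def to_utf8_csv(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode CSV bytes as UTF-8 and unify CR, LF and CR/LF row endings to LF."""
    try:
        # Already UTF-8, with or without a byte order mark.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: legacy variants were saved in this code page.
        text = raw.decode(source_encoding)
    # Replace CR/LF first so that it does not become two line breaks.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


legacy = "name\r\nZoë\r\n".encode("cp1252")
print(to_utf8_csv(legacy).decode("utf-8"))
```

For a file on disk, read it with open(path, "rb") and write the result with open(new_path, "wb") so that Python does not translate the line endings itself.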
JSON format
The JSON file must contain a single array holding all the items. Item keys and values are not prescribed, as long as the JSON file is valid. Supported values are string, number, boolean and array of strings. Nested objects cannot be used in a metadata file.
[{
	"gender": "Woman",
	"age": 49,
	"height": 176.5,
	"weight": 78.6,
	"bloodTests": ["BMP", "CMP", "Lipid"],
	"conditions": ["Diabetes", "Hypertension", "Asthma", "Epilepsy"],
	"genetics": [],
	"habits": ["Use of Alcohol"],
	"medicalHistory": ["Medication"],
	"otherExaminations": ["Urine"]
},
...
]
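The structure described above can be checked before uploading. The following Python sketch is an illustration only and not a Curator tool: the function name check_metaset and the wording of its messages are hypothetical. It treats anything outside the listed value types (including null) as unsupported, which is an assumption based on the list of supported values.

```python
import json


def check_metaset(json_text):
    """Return a list of problems; an empty list means the structure looks acceptable."""
    data = json.loads(json_text)  # raises an error if the JSON itself is invalid
    if not isinstance(data, list):
        return ["top level must be a single array of items"]

    problems = []
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            problems.append(f"item {index}: expected an object")
            continue
        for key, value in item.items():
            # string, number, boolean
            if isinstance(value, (str, int, float, bool)):
                continue
            # array of strings (an empty array is allowed)
            if isinstance(value, list) and all(isinstance(v, str) for v in value):
                continue
            problems.append(f"item {index}, key {key!r}: unsupported value")
    return problems


good = '[{"gender": "Woman", "age": 49, "height": 176.5, "genetics": [], "habits": ["Use of Alcohol"]}]'
bad = '[{"age": 49, "address": {"city": "Riga"}, "scores": [1, 2]}]'
print(check_metaset(good))
print(check_metaset(bad))
```

The first call reports no problems. The second reports the nested object under "address" and the array of numbers under "scores".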
Data sample file
You can also add a sample file to your dataset that contains no real data but mirrors the dataset's structure. This helps collaborators understand how your data is structured and how it could be applied in potential research. Any file format is accepted, as long as no real data are published in the file.