📚Dataset
A Dataset is a named collection of Samples (e.g. images, IMU data, tabular data) belonging to a single Project. The type of Samples within the Dataset is determined by the Type of the Project it belongs to. For instance, a Dataset for Computer Vision would contain images, while a Dataset for BioInformatics would have Sample files of DNA sequences.
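The relationship described above can be pictured with a small sketch. This is only an illustration of the data model, not the Coretex SDK: the `ProjectType`, `ALLOWED_EXTENSIONS`, and `Dataset` names, as well as the file extensions, are hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import List


class ProjectType(Enum):
    """Hypothetical project types, used only for this illustration."""
    COMPUTER_VISION = "computer_vision"
    BIOINFORMATICS = "bioinformatics"


# Assumed mapping from project type to the sample files it accepts.
ALLOWED_EXTENSIONS = {
    ProjectType.COMPUTER_VISION: {".jpg", ".jpeg", ".png"},
    ProjectType.BIOINFORMATICS: {".fasta", ".fastq"},
}


@dataclass
class Dataset:
    """A named collection of samples tied to one project type."""
    name: str
    project_type: ProjectType
    samples: List[Path] = field(default_factory=list)

    def add_sample(self, path: Path) -> None:
        # The project type decides which kind of sample is acceptable.
        if path.suffix.lower() not in ALLOWED_EXTENSIONS[self.project_type]:
            raise ValueError(
                f"{path.name} is not a valid sample for {self.project_type.value}"
            )
        self.samples.append(path)
```

With this sketch, adding `cat.png` to a Computer Vision dataset succeeds, while adding `genome.fasta` to the same dataset raises a `ValueError`.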
There are 3 ways of uploading data to the Coretex platform:
Uploading a dataset through the Web UI
Uploading a dataset using the CLI
Uploading a dataset using a custom Python script
The simplest way of uploading data to the Coretex platform is by using the Web UI. Once a Project is created you can upload data directly to Coretex.
After you create a Project, the next step is to create the Dataset by navigating to the Datasets screen and clicking the + New Dataset button in the top left panel. Once the form fields are populated, select your desired data source from the three mechanisms Coretex supports.
Upload data from a Coretex sample dataset. Click the Load sample dataset button, select one dataset from the list of sample datasets, and then click the Use Sample button.
Import data from a local path on your machine. Click the Upload File link or drag and drop your files to upload them.
Capture live data using your camera. Click the camera selection box and you will be presented with the interface for capturing images from your web camera (only available for the Computer Vision Task).
Once you have reviewed the dataset, click the Create Dataset button.
The most secure way to upload a big dataset to Coretex is by using the Coretex CLI tool. Before using the tool, you must first download and install the Coretex CLI package. After installation, the instructions will walk you through configuring the tool, which is required prior to use. Refer to the Coretex CLI documentation for further information on installation, and to the Import Data section of the tutorial for dataset upload examples.
To upload a dataset using a custom Python script, follow these steps:
Install the coretex dependency in your virtual environment with "pip install coretex".
Import the NetworkManager from coretex.networking and use it to authenticate to the Coretex platform.
Import the CustomItem from coretex (the data type of the Sample is determined by the Space Task) and write a custom function that creates the sample by calling sample.createSample.
Once you have gathered the list of files you want to upload to Coretex, import the MultithreadedDataProcessor from coretex.threading. This will allow you to upload up to 10 items at the same time.
Instantiate the MultithreadedDataProcessor, populate the required parameters, and run the process.
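The overall shape of such a script (gather files, then upload them a few at a time) can be sketched with the Python standard library alone. This is a rough illustration under stated assumptions, not the Coretex API: it uses `ThreadPoolExecutor` as a stand-in for MultithreadedDataProcessor, and `gather_files`, `upload_all`, and the `upload_sample` callback are hypothetical names. In a real script, `upload_sample` would wrap the authenticated sample-creation call from the coretex package.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List

# Mirrors the "up to 10 items at the same time" limit mentioned in the steps.
MAX_PARALLEL_UPLOADS = 10


def gather_files(root: Path, extensions: Iterable[str]) -> List[Path]:
    """Recursively collect files under root whose suffix is in extensions."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )


def upload_all(
    paths: List[Path],
    upload_sample: Callable[[Path], None],  # hypothetical per-file upload call
    workers: int = MAX_PARALLEL_UPLOADS,
) -> List[Path]:
    """Upload every path concurrently and return the paths that failed."""
    failed: List[Path] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit one task per file; the pool runs at most `workers` at once.
        futures = [(pool.submit(upload_sample, path), path) for path in paths]
        for future, path in futures:
            try:
                future.result()  # re-raises any exception from the worker
            except Exception:
                failed.append(path)
    return failed
```

Returning the failed paths instead of stopping at the first error lets the caller retry only the files that did not make it, which matters for large datasets.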
Approach | Pros | Cons |
---|---|---|
Web UI | Simple | Not reliable for large datasets due to browser limitations (you need to stay on the same page for the entirety of the upload process) |
CLI | Good for large datasets | Requires installing the Coretex CLI tool |
Python | Fastest upload | Requires Python skills |