📚Dataset

A Dataset is a named collection of Samples (e.g. images, IMU data, tabular data) belonging to a single Project. The type of Samples within a Dataset is determined by the Type of the Project it belongs to. For instance, a Dataset for Computer Vision would contain images, while a Dataset for BioInformatics would contain Sample files of DNA sequences.
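The relationship between a Project's Type and the Samples its Datasets accept can be pictured with a small, purely illustrative model. The names below (`ProjectType`, `ALLOWED_EXTENSIONS`, `Dataset.addSample`) and the file extensions are hypothetical and are not part of the coretex library. They only sketch the rule that the Project Type decides which Samples are valid.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import List


class ProjectType(Enum):
    # Hypothetical project types, named after the examples in the text
    COMPUTER_VISION = "computer_vision"
    BIOINFORMATICS = "bioinformatics"


# Hypothetical mapping from project type to accepted sample file extensions
ALLOWED_EXTENSIONS = {
    ProjectType.COMPUTER_VISION: {".jpg", ".jpeg", ".png"},
    ProjectType.BIOINFORMATICS: {".fasta", ".fastq"},
}


@dataclass
class Dataset:
    name: str
    projectType: ProjectType
    samples: List[str] = field(default_factory=list)

    def addSample(self, fileName: str) -> None:
        # Reject files whose extension does not match the project type
        extension = Path(fileName).suffix.lower()
        if extension not in ALLOWED_EXTENSIONS[self.projectType]:
            raise ValueError(f"{fileName} is not valid for {self.projectType.name}")
        self.samples.append(fileName)
```

In this sketch, adding `cat_001.jpg` to a Computer Vision dataset succeeds, while adding `genome.fasta` to the same dataset raises a `ValueError`.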

Dataset upload

There are three ways of uploading data to the Coretex platform:

  • Uploading a dataset through the Web UI

  • Uploading a dataset using the CLI

  • Uploading a dataset using a custom Python script

Uploading dataset through Web UI

The simplest way of uploading data to the Coretex platform is by using the Web UI. Once a Project is created, you can upload data directly to Coretex.

After you create a Project, the next step is to create the Dataset: navigate to the Datasets screen and click the + New Dataset button in the top left panel. Once the form fields are populated, select your data source from the three mechanisms Coretex supports.

  1. Upload data from a Coretex sample dataset. Click the Load sample dataset button, select one dataset from the list of sample datasets, and then click the Use Sample button.

  2. Import data from a local path on your machine. Click the Upload File link or drag and drop your files to upload.

  3. Capture live data using your camera (only available for the Computer Vision Task). Click the camera selection box and you will be presented with the interface for capturing images from your web camera.

Once you have reviewed the dataset, click the Create Dataset button.

Uploading dataset using CLI

The most secure way to upload a large dataset to Coretex is by using the Coretex CLI tool. Before using the tool, you must download and install the Coretex CLI package. After installation, the instructions walk you through configuring the tool, which is required before use. Follow the link below for further information on installation, and refer to the Import Data section of the tutorial for dataset upload examples.

🖥️Coretex CLI

Uploading dataset using a custom Python script

To upload a dataset with a Python script, follow these steps:

  1. Install the coretex dependency in your virtual environment with pip.

    % pip install coretex
  2. Import NetworkManager from coretex.networking and use it to authenticate to the Coretex platform.

    from coretex.networking import NetworkManager
    NetworkManager.instance().authenticate("demo@biomech.us", "Ncb18H£36Pz/")
  3. Import CustomItem from coretex (the data type of the Sample is determined by the Space Task) and write a function that creates the sample by calling CustomItem.createCustomItem.

    from pathlib import Path
    from coretex import CustomItem

    def createItem(path: str) -> CustomItem:
        # The second argument (3304 here) is assumed to be the ID of the target Dataset
        item = CustomItem.createCustomItem(Path(path).stem, 3304, path)
        if item is None:
            raise RuntimeError(f"Failed to create item from {path}")
        return item
  4. Once you have gathered a list of the files you want to upload to Coretex, import MultithreadedDataProcessor from coretex.threading. It allows you to upload up to 10 items at the same time.

    (class) MultithreadedDataProcessor(data: list, singleElementProcessor:
    (Any) -> None, threadCount: int = 1, title: str | None = None)
  5. Instantiate the MultithreadedDataProcessor, populate the required parameters, and run the process.

    from coretex.threading import MultithreadedDataProcessor

    # listOfFiles is the list of file paths gathered in the previous step
    processor = MultithreadedDataProcessor(
        listOfFiles,
        createItem,  # called once for each path in listOfFiles
        threadCount = 8,
        title = "createItem"
    )
    processor.process()
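The steps above assume that `listOfFiles` already exists. One way to build it, and to see what running a function over the list on several threads looks like, is sketched below using only the Python standard library. `gatherFiles` and `processInParallel` are hypothetical helper names, the default extensions are an assumption, and `processInParallel` is a stand-in built on `ThreadPoolExecutor`, not the actual implementation of MultithreadedDataProcessor.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, List, Tuple


def gatherFiles(directory: str, extensions: Tuple[str, ...] = (".jpg", ".png")) -> List[str]:
    """Recursively collects paths of files whose extension is in `extensions`."""
    root = Path(directory)
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in extensions
    )


def processInParallel(paths: List[str], worker: Callable[[str], object], threadCount: int = 8) -> None:
    """Stand-in for a multithreaded processor: runs `worker` on every path using a thread pool."""
    with ThreadPoolExecutor(max_workers=threadCount) as executor:
        # list() forces iteration so that exceptions raised in workers surface here
        list(executor.map(worker, paths))
```

With these helpers, `listOfFiles = gatherFiles("path/to/images")` produces a list that can be passed to MultithreadedDataProcessor as shown in step 5, and `processInParallel(listOfFiles, print)` can serve as a dry run before any upload is attempted.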

Pros and Cons of each Upload Method

Approach | Pros                    | Cons
---------+-------------------------+------------------------------------------
Web UI   | Simple                  | Not reliable for large datasets due to browser limitations (you need to stay on the same page for the entire upload process)
CLI      | Good for large datasets | Requires installing the Coretex CLI tool
Python   | Fastest upload          | Requires Python skills
