Best Practices
To help you get the most out of the platform, we have prepared some guidelines and suggestions.
Storage considerations
Each Run saves a snapshot of your Task code, so be careful about which files you upload with it. Unnecessary files waste space in every snapshot.
Coretex Runs are lightweight and meant primarily to hold Python code. If you need to store larger static auxiliary files, either store them in a Coretex Dataset, or keep them in an external public location and use Coretex Cache.
Preprocessing inputs
Datasets are sometimes small, but they are often very large, and large datasets are challenging to process. For this reason, it is important to preprocess a dataset before running a task on it.
In practice, you may need to apply certain transformations to the dataset once, before using it for a large number of runs. One such transformation is keeping only certain attributes of the dataset (known as Feature Selection), which can reduce the training time of each run performed on that dataset.
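As a rough illustration of Feature Selection as a one-time preprocessing step, the sketch below keeps only a chosen subset of columns from a CSV file. It uses only the Python standard library and is not part of the Coretex API. The `select_columns` helper and the column names (`id`, `length`, `charge`, `comment`) are hypothetical and chosen purely for the example.

```python
import csv
import io
from typing import TextIO


def select_columns(source: TextIO, destination: TextIO, keep: list[str]) -> int:
    """Copy only the columns listed in `keep` from one CSV stream to another.

    Returns the number of data rows written. This is a hypothetical helper
    for illustration, not a Coretex function.
    """
    reader = csv.DictReader(source)
    available = reader.fieldnames or []

    # Fail early if a requested column does not exist in the input.
    missing = [name for name in keep if name not in available]
    if missing:
        raise KeyError(f"Columns not found in input: {missing}")

    # extrasaction="ignore" drops every column that is not listed in `keep`.
    writer = csv.DictWriter(
        destination, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()

    row_count = 0
    for row in reader:
        writer.writerow(row)
        row_count += 1
    return row_count


# In-memory example data (made up for this sketch).
raw = io.StringIO("id,length,charge,comment\n1,120,0.5,ok\n2,98,-1.0,check\n")
reduced = io.StringIO()
rows_written = select_columns(raw, reduced, ["length", "charge"])
```

With real data you would pass open file handles instead of `io.StringIO` objects, then store the reduced file as its own dataset so that later runs never need to read the dropped columns.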
Consider storing your transformed dataset on the platform (Upload Dataset) and using it instead of the unprocessed one for all subsequent runs.
For a worked example, see the Protein Stability Template. It stores expensive-to-compute protein sequence embeddings in a separate dataset, then uses that dataset when running the task to train a network that predicts protein stability.
Debugging task code
When you execute a run, most logs are available in the run console by default. If you want more detailed logs, import Python's standard logging library in your task code files before executing the run, and use it as you would in any Python script.
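A minimal sketch of what this can look like in a task code file is shown below. It uses only the standard `logging` module. The logger name `"task"` and the `scale_to_unit_range` function are hypothetical and stand in for your own code. Which log levels actually appear in the run console depends on how logging is configured, which this sketch does not assume.

```python
import logging

# Hypothetical logger name; any name works with the standard library.
logger = logging.getLogger("task")


def scale_to_unit_range(values: list[float]) -> list[float]:
    """Scale values linearly into [0, 1], logging intermediate details."""
    logger.debug("Received %d values", len(values))

    if not values:
        logger.warning("Empty input, nothing to scale")
        return []

    low, high = min(values), max(values)
    logger.debug("Value range: low=%s high=%s", low, high)

    if low == high:
        # Avoid division by zero when every value is identical.
        logger.warning("All values are equal (%s), returning zeros", low)
        return [0.0 for _ in values]

    return [(value - low) / (high - low) for value in values]
```

When trying this as a standalone script outside the platform, calling `logging.basicConfig(level=logging.DEBUG)` once at startup makes the debug messages visible on the terminal.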
All logs are saved in the .coretex/logs directory on your node, in files sorted by date. In addition to your own logs, these files may also contain API logs, which can further aid debugging.