You want to make live changes to Daft, and test these changes on a Ray cluster.
Prerequisites:
You must develop Daft on the same OS / architecture as your Ray cluster. (The easiest way to do this is to develop Daft on an EC2 dev box with the same OS / architecture as the cluster.)
A Ray cluster.
Add this to the base Cargo.toml. This is needed because the .so file must be < 100 MB. It is a workaround until we can make our builds smaller.
[profile.release]
strip = true
panic = "abort"
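Once a release build exists, the size limit above can be checked with a short script rather than by eye. The following is a minimal sketch, not part of Daft's tooling: the function name `oversized_shared_objects`, the `daft` search directory, and the reading of "100mb" as 100 * 10**6 bytes are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "100mb" is read as 100 * 10**6 bytes; adjust if your limit differs.
SIZE_LIMIT_BYTES = 100 * 10**6


def oversized_shared_objects(root, limit_bytes=SIZE_LIMIT_BYTES):
    """Return (path, size_in_bytes) for each .so file under root at or over the limit."""
    root = Path(root)
    if not root.is_dir():
        return []
    found = []
    for path in sorted(root.rglob("*.so")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size >= limit_bytes:
            found.append((path, size))
    return found


if __name__ == "__main__":
    # Assumption: the built .so ends up somewhere under ./daft; change the root if not.
    offenders = oversized_shared_objects("daft")
    for path, size in offenders:
        print(f"{path}: {size / 10**6:.1f} MB is not under the limit")
    if not offenders:
        print("No .so file at or over the limit was found.")
```

Run it from the directory that contains the build output. An empty result only means that no `.so` file under the searched directory reached the limit, so point it at the right directory.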
Steps:
Make changes to Daft.
Build Daft in release mode: make build-release
Make sure the .so file is < 100 MB.
Create a Python script to run your experiment / test.
Put the script in an empty directory.
From inside that directory, create a symlink to the Daft directory:
ln -s ../daft daft
Create a runtime_env.yaml file:
pip:
  packages:
    - pyarrow
    - numpy
    - tqdm
    - fsspec
  pip_check: false
env_vars:
  PYTHONPATH: .
Submit the job to the Ray cluster:
ray job submit --address "http://localhost:8265" --runtime-env "runtime_env.yaml" --working-dir "working_dir" -- python3 test.py
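The same submission can also be made from Python through Ray's job submission SDK, which can be handy when you resubmit often. This is a rough sketch under assumptions: the helper names `build_runtime_env` and `submit` are hypothetical, the dict simply mirrors the runtime_env.yaml shown above, and the address and `working_dir` values are copied from the command line above.

```python
def build_runtime_env(working_dir="working_dir"):
    """Mirror runtime_env.yaml as a dict, plus the local directory to upload."""
    return {
        "working_dir": working_dir,
        "pip": {
            "packages": ["pyarrow", "numpy", "tqdm", "fsspec"],
            "pip_check": False,
        },
        "env_vars": {"PYTHONPATH": "."},
    }


def submit(address="http://localhost:8265", entrypoint="python3 test.py"):
    """Submit the entrypoint as a Ray job and return the job ID."""
    # Imported lazily so this file can be loaded on a machine without Ray installed.
    from ray.job_submission import JobSubmissionClient

    client = JobSubmissionClient(address)
    return client.submit_job(
        entrypoint=entrypoint,
        runtime_env=build_runtime_env(),
    )
```

Calling `submit()` needs Ray installed locally and a reachable cluster at the given address. It returns the job ID, which can then be passed to `ray job logs` or `ray job status`.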