# ps-conda-environment
Setting up an Anaconda environment for Plant Science data pipelines
- Manual by Sungchan Oh (Sun)
- First Created: November 13, 2024
- Last Updated: November 13, 2024
- Contact: oh231@purdue.edu
- Prerequisite
  - Access to Purdue Compute Resources (e.g., [Negishi](https://www.rcac.purdue.edu/compute/negishi))
  - Terminal software ([MobaXterm (Win)](https://mobaxterm.mobatek.net/), [iTerm2 (macOS)](https://iterm2.com/), [Terminator (Linux)](https://gnome-terminator.org/))
___
## Three ways to get your environment ready for plant science data pipelines
1. [Manually create Anaconda environment](#1-manually-create-anaconda-environment)
   - Construct an Anaconda environment step by step, customizing it to specific requirements.
   - This method offers granular control but requires careful package management.
   - Note: This approach has been successfully tested on Negishi (Linux-64) systems.
2. [Create Anaconda environment using `environment.yml`](#2-create-anaconda-environment-using-environmentyml)
   - Use an `environment.yml` file to define the environment's dependencies.
   - This approach is more automated and reproducible, especially for those using Negishi.
   - Note: This method has also been tested on Negishi (Linux-64) systems.
3. [Use prebuilt Apptainer/Docker container](#3-use-prebuilt-apptainerdocker-container)
   - Use a preconfigured Apptainer/Docker container that encapsulates the entire pipeline and its environment.
   - This method is convenient for immediate use and portability.
   - It eliminates the need for manual environment setup and ensures consistency across different systems.
## Important Considerations
- Environment Compatibility: These methods have been tested on Negishi (Linux-64); compatibility with other systems may vary. Successful installation and execution may depend on specific system configurations and package versions.
- Anaconda and Python Version Compatibility: If you prefer manual environment setup, first identify versions of Anaconda and Python that support the required Python packages.
- Apptainer/Docker Container Adoption: The prebuilt Apptainer container is still under development, so check back for updates on its availability and usage instructions. Once available, it should provide a simplified and standardized environment for running plant science data pipelines.

Considering these factors up front helps users establish a working environment for their plant science data analysis tasks.
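
As a quick compatibility check, the following minimal sketch reports whether the current machine resembles the tested platform. It assumes that conda's `linux-64` label corresponds to Linux on an `x86_64` CPU; the function name `is_tested_platform` is hypothetical and not part of any pipeline.

```python
import platform
import sys


def is_tested_platform(system, machine):
    """Return True for Linux on x86_64 (assumed meaning of conda's linux-64)."""
    return system == "Linux" and machine == "x86_64"


if __name__ == "__main__":
    system, machine = platform.system(), platform.machine()
    print(f"System: {system} ({machine}), Python {sys.version.split()[0]}")
    if not is_tested_platform(system, machine):
        # The instructions may still work, but they were not tested here.
        print("Note: this platform differs from the tested Linux-64 setup.")
```

A mismatch does not mean the setup will fail, only that the steps in this manual were not verified on that platform.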
## 1. Manually create Anaconda environment
- Launch a terminal emulator and connect to the Negishi login node.
```bash
ssh [your_id]@negishi.rcac.purdue.edu
```
- Replace `[your_id]` with your actual Purdue ID (e.g., `oh231`), without the square brackets.
- Square brackets mark placeholders throughout this manual; remove them from all placeholders.
- For the remaining placeholders like `["something"]`, replace `something` with the appropriate value and remove the square brackets.
- Load the Anaconda module and create a new Anaconda environment.
```bash
module load anaconda/2024.02-py311.lua
conda create --name ps37 python=3.7.9 -y
```
- Note: Python 3.7 is used here because it is compatible with the Python dependencies used in the Plant Science pipelines.
- Activate the new Anaconda environment.
```bash
conda activate ps37
```
- Install Python dependencies.
```bash
conda install -c conda-forge ipython numpy gdal pandas matplotlib -y
conda install -c conda-forge laspy lastools rasterio fiona python-pdal pdal -y
conda install -c conda-forge asteval pyproj scikit-image spectral opencv -y
```
- Installation may take several minutes.
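
Once installation finishes, a short script can confirm that the packages are visible to the environment's Python. The sketch below is a minimal example, not part of the pipelines: the helper name `find_missing` is hypothetical, and the import names (`osgeo` for `gdal`, `skimage` for `scikit-image`, `cv2` for `opencv`, `pdal` for `python-pdal`) are the usual ones for these conda packages. `lastools` is left out on the assumption that it provides command-line tools rather than a Python module.

```python
import importlib.util

# Import names differ from conda package names in a few cases
# (gdal -> osgeo, scikit-image -> skimage, opencv -> cv2, python-pdal -> pdal).
MODULES = [
    "IPython", "numpy", "osgeo", "pandas", "matplotlib",
    "laspy", "rasterio", "fiona", "pdal",
    "asteval", "pyproj", "skimage", "spectral", "cv2",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(MODULES)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

This only checks that each module can be located; it does not execute the import, so an import-time error in a located package would not be detected here.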
## 2. Create Anaconda environment using `environment.yml`
- Launch a terminal emulator and connect to the Negishi login node as instructed in [1. Manually create Anaconda environment](https://github.itap.purdue.edu/PlantScience/ps-conda-environment?tab=readme-ov-file#1-manually-create-anaconda-environment).
- Load the Anaconda module.
```bash
module load anaconda/2024.02-py311.lua
```
- Either navigate to the directory containing the `environment.yml` file or download the file to your current working directory.
- Create a new Anaconda environment and install the required dependencies using the following command:
```bash
conda env create -f environment.yml
```
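
To illustrate what such a file typically contains, the sketch below prints an `environment.yml` that mirrors the manual installation in section 1. This is an assumption for illustration only: the `environment.yml` shipped with this repository may list different packages or pin exact versions, and the helper `build_environment_yml` is hypothetical.

```python
def build_environment_yml(name, python_version, packages):
    """Return environment.yml text for a conda-forge based environment."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - python={python_version}",
    ]
    lines.extend(f"  - {pkg}" for pkg in packages)
    return "\n".join(lines) + "\n"


# Package list copied from the manual installation commands in section 1.
PACKAGES = [
    "ipython", "numpy", "gdal", "pandas", "matplotlib",
    "laspy", "lastools", "rasterio", "fiona", "python-pdal", "pdal",
    "asteval", "pyproj", "scikit-image", "spectral", "opencv",
]

if __name__ == "__main__":
    # Print instead of writing a file, so an existing environment.yml is untouched.
    print(build_environment_yml("ps37", "3.7.9", PACKAGES), end="")
```

The `name` field determines the name of the environment that `conda env create` produces, so the environment would be activated with `conda activate ps37` in this sketch.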
## 3. Use prebuilt Apptainer/Docker container
- TODO
## Verify `laspy` dependency and potential workaround
- Run Python and import `laspy`.
```python
import laspy
```
- If the `laspy` import fails, the following steps describe a possible workaround for incompatibility issues between `laspy` and the `queue` module.
- Locate `copc.py` (environment-specific): The exact path to `copc.py` depends on your Python installation and environment. Use `conda env list` or `which python` to find the active environment and its Python executable, then navigate to that environment's site-packages directory.
- A common location for the site-packages directory is:
```bash
~/.conda/envs/<module_name>?/<env_name>/lib/python<version>/site-packages
```
- Once in the site-packages directory, open the `laspy` package directory and locate the `copc.py` file. Example:
```bash
~/.conda/envs/2024.02-py311/ps37/lib/python3.7/site-packages/laspy/copc.py
```
- Edit `copc.py` (with Caution :exclamation:)
- Warning: Improper modification of installed package files can lead to unintended consequences. Proceed with caution and back up the file before making changes.
- Once you have found `copc.py`, open it in a text editor (Notepad, vim, etc.). Locate the following line:
```python
from queue import Queue, SimpleQueue
```
- Comment out this line with `#`. Here's the modified code:
```python
#from queue import Queue, SimpleQueue
```
- Before or after the commented-out line, add the following code:
```python
from queue import Queue
```
- Run Python and import `laspy` again to confirm that the import now succeeds.
```python
import laspy
```
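
The locate-and-edit steps above can also be sketched in Python. This is a minimal dry-run example under stated assumptions: `find_copc_path` and `patch_queue_import` are hypothetical helper names, the script only reports what it finds and never writes to any file, and it assumes the import line appears unindented exactly as shown above.

```python
import importlib.util
from pathlib import Path

OLD_IMPORT = "from queue import Queue, SimpleQueue"
NEW_IMPORT = "from queue import Queue"


def find_copc_path():
    """Return the path to laspy/copc.py, or None if it cannot be located."""
    # find_spec locates the package without importing it.
    spec = importlib.util.find_spec("laspy")
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / "copc.py"
        if candidate.is_file():
            return candidate
    return None


def patch_queue_import(source):
    """Comment out the original import line and add the Queue-only import."""
    patched = []
    for line in source.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        if body == OLD_IMPORT:
            ending = line[len(body):]
            patched.append("#" + OLD_IMPORT + (ending or "\n"))
            patched.append(NEW_IMPORT + ending)
        else:
            patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    path = find_copc_path()
    if path is None:
        print("laspy/copc.py was not found in this environment.")
    else:
        original = path.read_text()
        changed = patch_queue_import(original) != original
        # Dry run: report only, so the edit itself stays a deliberate manual step.
        print(f"Found: {path}")
        print("Import line present:", "yes" if changed else "no")
```

Keeping the script read-only is a deliberate choice: it confirms the file location and whether the line exists, while the backup and the edit remain manual, as the warning above recommends.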