Simple
For all of these examples, we will use the Prop3D-20 dataset:
export HS_USER=None
export HS_PASSWORD=None
export HS_ENDPOINT=http://prop3d-hsds.pods.virginia.edu
export PROP3D_DATA=/CATH/Prop3D-20.h5
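When working from a notebook or script rather than a shell, the same settings can be placed in the process environment from Python before any data-access code runs. The following is a minimal sketch: the helper name configure_prop3d_env is hypothetical (not part of Prop3D or h5pyd), and it assumes the libraries read these variables from the environment at the time a file is opened.

```python
import os

# Values copied verbatim from the shell exports above.
PROP3D_SETTINGS = {
    "HS_USER": "None",
    "HS_PASSWORD": "None",
    "HS_ENDPOINT": "http://prop3d-hsds.pods.virginia.edu",
    "PROP3D_DATA": "/CATH/Prop3D-20.h5",
}


def configure_prop3d_env(env=os.environ):
    """Hypothetical helper: copy the settings into `env`.

    Existing values are kept, so anything already exported in the
    shell takes precedence over the defaults listed here.
    """
    for name, value in PROP3D_SETTINGS.items():
        env.setdefault(name, value)
    return env


# Apply to the current process environment.
configure_prop3d_env()
```

Using setdefault rather than plain assignment keeps the shell exports authoritative when both are present.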
Get a single protein from the dataset with Python (h5pyd)
import os
import h5pyd
with h5pyd.File(os.environ["PROP3D_DATA"], use_cache=False) as f:
    domain = f["2/30/30/100/domains/1kq2A00/atom"][:]
    ...  # Code to process domain
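The processing step above is left open. As one possible illustration, the sketch below centers the atomic coordinates on their centroid. It assumes the atom table arrives as a NumPy structured array with coordinate fields named X, Y, and Z; those field names are an assumption here, so check domain.dtype.names on the real data. A small synthetic array stands in for domain so the example runs without network access.

```python
import numpy as np


def centered_coordinates(atoms, fields=("X", "Y", "Z")):
    """Return an (n_atoms, 3) float array of coordinates with zero mean.

    `fields` names the coordinate columns; the defaults are assumed,
    not confirmed against the Prop3D-20 schema.
    """
    coords = np.stack([atoms[name] for name in fields], axis=1).astype(np.float64)
    return coords - coords.mean(axis=0)


# Synthetic stand-in for the `domain` array read from the dataset.
atom_dtype = np.dtype([("serial_number", "i4"), ("X", "f4"), ("Y", "f4"), ("Z", "f4")])
domain = np.array([(1, 0.0, 0.0, 0.0), (2, 2.0, 4.0, 6.0)], dtype=atom_dtype)

centered = centered_coordinates(domain)
```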
Analyze a single domain using Prop3D
We recommend using DistributedStructure <https://github.com/bouralab/Prop3D/blob/main/Prop3D/common/DistributedStructure.py> and/or DistributedVoxelizedStructure <https://github.com/bouralab/Prop3D/blob/main/Prop3D/common/DistributedVoxelizedStructure.py> from the Prop3D GitHub repo <https://github.com/bouralab/Prop3D>.
import os
import numpy as np
from Prop3D.common.DistributedStructure import DistributedStructure
from Prop3D.common.DistributedVoxelizedStructure import DistributedVoxelizedStructure
# Load the structure and work in atomic coordinate space
structure = DistributedStructure(
    os.environ["PROP3D_DATA"],
    key="2/30/30/100",
    cath_domain_dataset="1kq2A00")

# Save 5 random rotations sampled from the SO(3) group to PDB files
for i, (r, M) in enumerate(structure.rotate(num=5)):
    structure.save_pdb(f"1kq2A00_rotation{i}.pdb")

# Load the structure and work in voxelized coordinate space
structure = DistributedVoxelizedStructure(
    os.environ["PROP3D_DATA"],
    key="2/30/30/100",
    cath_domain_dataset="1kq2A00")

# Save 5 random rotations sampled from the SO(3) group to NumPy files
for i, (r, M) in enumerate(structure.rotate(num=5)):
    coords, feats = structure.map_atoms_to_voxel_space(autoencoder=True)
    np.savez(f"1kq2A00_rotation{i}.npz", coords=coords, feats=feats)
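Both loops above draw random rotations from the SO(3) group. To make that step concrete, here is a minimal NumPy-only sketch of one common way to sample such rotations (QR decomposition of a Gaussian matrix) and apply them about a structure's centroid. This is not necessarily how structure.rotate is implemented; the function names and the random stand-in coordinates are illustrative assumptions.

```python
import numpy as np


def random_rotation_matrix(rng):
    """Draw a 3x3 rotation matrix uniformly from SO(3) via QR decomposition."""
    gaussian = rng.standard_normal((3, 3))
    q, r = np.linalg.qr(gaussian)
    # Remove the sign ambiguity of QR so the orthogonal factor is uniformly distributed.
    q = q * np.sign(np.diag(r))
    # Orthogonal matrices with determinant -1 are reflections; flip one axis to fix that.
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q


def rotate_about_centroid(coords, rotation):
    """Rotate (n_atoms, 3) coordinates about their centroid."""
    centroid = coords.mean(axis=0)
    return (coords - centroid) @ rotation.T + centroid


rng = np.random.default_rng(0)
coords = rng.standard_normal((10, 3))  # stand-in for atomic coordinates
rotations = [random_rotation_matrix(rng) for _ in range(5)]
rotated = [rotate_about_centroid(coords, R) for R in rotations]
```

Rotating about the centroid keeps the structure in place, so only its orientation changes between samples.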