The Stanford PHS supports four service components that facilitate storage and usage of high-risk datasets: (1) the PHS Data Portal, (2) Nero GCP, (3) Carina, and (4) the Virtual Windows Cloud S-VDI. These computing environments offer a variety of widely used programs (including R, SAS, Stata, Python, SQL, and others). An overview of the PHS SDE and its components is available here.
The image below illustrates the interconnection between the service components that make up the PHS computing environment. The Stanford Information Security Office (ISO) has approved the use of the software to support high-risk data storage and analysis. The Stanford PHS, the Stanford SRC, and Stanford CTSC operate the machines and provide support for the researchers and partners who use these systems.
Name | Description | Languages Available | Costs |
---|---|---|---|
PHS Data Portal (includes the Redivis Notebooks) | A data catalog to find and explore PHS-related datasets, cut analytic cohorts, and analyze high-risk data. | | Free for most uses. You can pre-buy extra computational resources if needed. Costs are transparent and predictable, and are roughly equivalent to GCP costs. |
Nero GCP | A flexible and powerful cloud-based computing environment for high-risk data. | | These machines are in Google Cloud, so there are cloud costs and setup costs, but they are not prohibitive. Approximately $100-400/month for a typical machine. You can access Google’s price calculator here (though there is a Stanford discount that is not reflected in the calculator). |
Carina | A secure, Linux-based, on-premises computing environment for high-risk data. | | 1TB of data storage is free. Extra storage can be purchased (either a new node, or ~$94.65/month for 10TB). |
Virtual Windows Cloud S-VDI | A powerful and flexible Azure cloud-based Windows computing environment for high-risk data. | | These machines are in a cloud environment, so there are cloud costs and setup costs, but they are not prohibitive. You can request an individual virtual machine at $80 per user per month (4 CPU, 16GB RAM, 120GB storage; costs increase linearly for higher-spec machines, and there is an additional cost for Stata software). More detailed pricing information is available here. |
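
For rough budgeting, the figures quoted in the table can be combined into a small estimate. The sketch below is illustrative only: the function names are hypothetical, it assumes extra Carina storage is billed in whole 10TB blocks on top of the free 1TB, and it covers only the base S-VDI machine (4 CPU, 16GB RAM, 120GB storage) without Stata or higher-spec pricing. Actual charges should be confirmed with the service teams.

```python
import math

# Figures quoted in the table above.
CARINA_FREE_TB = 1               # free storage allowance
CARINA_BLOCK_TB = 10             # size of one purchasable storage block
CARINA_BLOCK_USD = 94.65         # approximate monthly price per block
SVDI_BASE_USD_PER_USER = 80.0    # base virtual machine, per user per month


def carina_storage_monthly_usd(total_tb: float) -> float:
    """Estimate monthly Carina storage cost.

    Assumption (not stated in the source): storage beyond the free 1TB
    is bought in whole 10TB blocks.
    """
    extra_tb = max(0.0, total_tb - CARINA_FREE_TB)
    blocks = math.ceil(extra_tb / CARINA_BLOCK_TB)
    return round(blocks * CARINA_BLOCK_USD, 2)


def svdi_base_monthly_usd(users: int) -> float:
    """Estimate monthly S-VDI cost for base machines only (no Stata license)."""
    return users * SVDI_BASE_USD_PER_USER


if __name__ == "__main__":
    print("Carina, 12TB total:", carina_storage_monthly_usd(12))
    print("S-VDI, 3 users:", svdi_base_monthly_usd(3))
```

Under these assumptions, a lab storing 12TB on Carina would need two extra blocks, and three S-VDI users would each be billed at the base rate.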
Note about Security: Data access is monitored less closely on Nero, Carina, and S-VDI than on the PHS Data Portal.
This means that we rely on each researcher to be responsible and to share data only with other PHS-approved lab members. Before transferring any data, please check that such lab members have completed the data access requirements (and had them approved), and that their access rights are still valid.
Failure to follow the Data Use Agreement will lead to disciplinary action.