University of Pittsburgh
Research Informatics Office

FAQ

Please click on a question to find the answer.

What is R3?
Why and how did R3 get started?
Where do I sign up for R3?
How do I cite R3 in my publications?
What codes should I use for my study? Why are codes important?
Why do I have to use both ICD 9 and ICD 10 codes for my project?
How do I handle the recruitment process?
What is CTSI and how can they help me?
How can DBMI help me with my overall research?
What is a QI/QA project?
How long does it take to get my data?
How does UPMC factor into R3?
What is an IRB protocol and why do I need one?
What is an Honest Broker? Do I need one?
How do I get images through R3?
What kinds of data can R3 pull?
What is discrete data?
What is Globus?
How do I get a Globus account?
What is Neptune?
How is R3 able to provide this data?
How much will R3 cost?
How does the R3 process work?
What is an Epic Build?
How does ID matching work?
What happened to CARe?
What is a specification?
What is an attestation and why do I have to sign one in order to get my data?

What is R3?
R3 serves several key functions: 1) as a Health Sciences core facility to deliver data with charge-back to investigators like other cores; 2) operating on behalf of UPMC to release data according to approved IRB protocols; and 3) operating on behalf of UPMC and Pitt to acquire attestations from investigators in regard to their responsibilities with clinical data.

R3 is designed to help UPMC and University of Pittsburgh researchers gather EMR data in one convenient location by using the Neptune database. The database houses Epic, MARS, and Cerner data that is managed by R3 and provisioned as either de-identified or identified data. Data includes EMR discrete data, text, and DICOM images. R3 improves efficiency, cost, and quality for researchers gathering EMR data for research. R3 acts as a guide to help researchers identify the best data for their projects, so that they do not have to go directly to multiple UPMC system owners (Cerner, MARS, Epic, Cancer Registry, etc.).

Why and how did R3 get started?
R3 was started by the Chief Research Informatics Officer (CRIO), Jonathan C. Silverstein, MD, MS, FACS, FACMI. The CRIO is responsible for coordinating the healthcare data, infrastructure, and services for the Department of Biomedical Informatics (DBMI), the University of Pittsburgh Health Sciences, and the Institute for Precision Medicine (http://ipm.pitt.edu). DBMI, and Dr. Silverstein’s team specifically, addressed the problem that UPMC and Pitt lacked a scalable, low-cost, high-quality honest brokerage to provision all UPMC clinical data for research. After initial designs by Shyam Visweswaran, the team created Neptune, a database where EMR discrete data is ingested, organized, and housed in one location to serve researchers. This was later extended with EMR text and then with DICOM image provisioning in collaboration with UPMC Enterprises. Expanding services to include claims data from the Health Plan is under consideration.

Where do I sign up for R3?
https://www.rio.pitt.edu/services

How do I cite R3 in my publications?
Please remember to acknowledge the Clinical and Translational Science Institute (CTSI) in all publications arising from projects that received services through R3. The CTSI provides the following citation for your convenience: The project described was supported by the National Institutes of Health through Grant Number UL1TR001857. Please also reference our Neptune paper: Visweswaran S, McLay B, Cappella N, Morris M, Milnes JT, Reis SE, Silverstein JC, Becich MJ. An atomic approach to the design and implementation of a research data warehouse. J Am Med Inform Assoc. Cold Spring Harbor Laboratory Press; 2022 Mar 15;29(4):601-608. PMCID: PMC8922189

What codes should I use for my study? Why are codes important?
Codes give the R3 team a convenient and consistent way to search for data. A single code can also be used across multiple EMR systems when researchers are getting data from a variety of sources.
Types of codes R3 regularly uses:
ICD: used for diagnoses (ICD 10 codes have been in use since October 1, 2015)
CPT: used for procedures
LOINC: used for lab tests and results
RxNorm, NDC: used for medications
Other: various other codes may be used, but typically our most important are ICD, CPT, and LOINC

Why do I have to use both ICD 9 and ICD 10 codes for my project?
Clinical data recorded before October 1, 2015 was coded with ICD 9. ICD 10 codes have been used from October 1, 2015 to the present. Therefore, any data request that spans dates both before and after October 1, 2015 needs both ICD 9 and ICD 10 codes.
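The date rule above can be sketched in a few lines of Python. This is only an illustration of the cutover logic, not an R3 tool; the function name icd_versions_needed and the labels "ICD-9" and "ICD-10" are assumptions made for the example.

```python
from datetime import date

# Cutover date stated in the answer above.
ICD10_START = date(2015, 10, 1)


def icd_versions_needed(start: date, end: date) -> list[str]:
    """Return the ICD versions a study needs for its date range (inclusive)."""
    if end < start:
        raise ValueError("end date must not be earlier than start date")
    versions = []
    if start < ICD10_START:
        versions.append("ICD-9")   # some data predates the cutover
    if end >= ICD10_START:
        versions.append("ICD-10")  # some data falls on or after the cutover
    return versions
```

For a study covering 2012 through 2018, the function returns both versions, which is the case where code lists must be prepared in ICD 9 and ICD 10.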

How do I handle the recruitment process?
If you are recruiting patients at the Children’s Institute of Pittsburgh, you can use PedsNet to help with recruitment. If you are recruiting patients from elsewhere in the UPMC EMR, you can get help from CTSI (https://ctsi.pitt.edu/). R3 can help develop lists of patients based on EMR data when the IRB authorizes investigators to do so, but R3 doesn’t directly engage with the practices.

What is CTSI and how can they help me?
https://ctsi.pitt.edu/research-services/

CTSI works directly with researchers in collaboration with regulatory agencies to navigate all necessary regulatory pathways, at any stage of their research. They offer help with regulatory resources, community outreach, recruitment, research facilities and networks, and more.

How can DBMI help me with my overall research?
https://www.dbmi.pitt.edu/

The Department of Biomedical Informatics (DBMI) at the University of Pittsburgh School of Medicine has an annual research budget of over $15M. DBMI’s current clinical and healthcare data portfolio includes the NIH-funded Precision Medicine Initiative, a CTSA Informatics Component with a research data repository, a PCORnet informatics hub site and data repository, an informatics and data hub for the NCATS-funded Accrual of patients to Clinical Trials (ACT), an NIH Common Fund infrastructure award for the Human BioMolecular Atlas Program (HuBMAP), and a CDC-NIOSH funded National Mesothelioma Virtual Bank. DBMI is also home to an active NLM-funded Biomedical Informatics Training program that has over 30 trainees. With all of these resources available, and working directly with Dr. Silverstein’s RIO team, Dr. Becich, Dr. Visweswaran, and others, DBMI can help researchers with their projects as collaborative informatics investigators, beyond the work of R3.

What is a QI/QA project?
QI/QA (quality improvement/quality assurance) projects take a systematic, formal approach to analyzing practice performance and working to improve it. R3 does NOT handle these requests. Instead, please visit: https://infonet.upmc.com/ClinicalTools/QualityImprovement/
*Note: The website only works with a UPMC sign-on.

How long does it take to get my data?
It varies because the number of steps varies tremendously by project. For example, if the project involves DICOM data that has not previously been extracted from the archive, or if we have to engage other UPMC teams to acquire new data sources, this will create additional steps. As the requestor, you are responsible for answering any questions the R3 team has about your project in a timely fashion to keep the steps moving forward. While we strive to get back to you as quickly as possible, we are working on multiple projects at once, so to be efficient overall we aim to respond to every message within two weeks. We ask that you be patient with us and think ahead of time about the data you’ll need for your project. Investigators who are slow to respond to us while demanding an immediate response from us do not ingratiate themselves with us. Whether we are asking the investigator for something or the investigator is asking us, we prefer a response within two weeks in both directions, with at least a status update.

How does UPMC factor into R3?
R3, a University of Pittsburgh service core, works under a HIPAA Business Associates Agreement on behalf of UPMC, specifically in coordination with the UPMC Chief Medical Information Officer.

What is an IRB protocol and why do I need one?
https://www.hrpo.pitt.edu/getting-started

What is an Honest Broker? Do I need one?
An Honest Broker is someone who can de-identify patient information for the purpose of research under an approved IRB protocol or a determination of non-human subjects research. R3 offers Honest Broker services to investigators. R3 also provisions identified data under IRB approval. If you need an Honest Broker assurance form for your IRB protocol, please contact healthr3@pitt.edu.

How do I get images through R3?
R3’s routine intake processes now include, in collaboration with UPMC Enterprises, the specification, provisioning, and de-identification (when required) of clinical DICOM data from the petabytes of data in UPMC’s vaults.

What kinds of data can R3 pull?
R3 can pull all kinds of EMR data, including discrete data, DICOM images, and text. R3 can also help you source research data from Cerner, Epic, ARIA, and various other UPMC resources and databases. We routinely ingest and index discrete data in monthly loads, covering 2004 to approximately 35 days before the current date. We routinely ingest and index text data in monthly loads, covering 2011 to approximately 45 days before the current date. DICOM data is extracted from UPMC vaults on demand.
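As a rough planning aid, the coverage windows described above can be estimated with a short Python sketch. The start years and lag figures come from the answer above; treating each start as January 1 of the stated year, and the names AVAILABILITY and expected_window, are assumptions for illustration. Because loads are monthly and the lags are approximate, the result is an estimate only.

```python
from datetime import date, timedelta

# (assumed start date, approximate lag in days) for routinely ingested data.
# Start dates assume January 1 of the year given in the FAQ answer.
AVAILABILITY = {
    "discrete": (date(2004, 1, 1), 35),
    "text": (date(2011, 1, 1), 45),
}


def expected_window(kind: str, today: date) -> tuple[date, date]:
    """Estimate the earliest and latest dates likely covered for a data kind."""
    if kind not in AVAILABILITY:
        # DICOM data is extracted on demand, so it has no routine window.
        raise ValueError(f"no routine ingest window for {kind!r}")
    start, lag_days = AVAILABILITY[kind]
    return start, today - timedelta(days=lag_days)
```

For example, calling expected_window("text", date(2024, 6, 30)) estimates coverage through mid-May 2024.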

What is discrete data?
Discrete data is nominal or numeric data (not text blocks) that is entered into the EHR. These typically include the following (but there are thousands of discrete data types):
HIPAA Identifiers, demographics
Encounters: outpatient, ED, inpatient
Diagnoses: billing, encounter based, problem list
Procedures: billing
Medications: orders/prescribe, dispense
Laboratory tests: orders, results
Social history: tobacco, alcohol
Vitals, allergies
Metadata for notes

What is Globus?
Globus is the HIPAA/BAA subscription distribution system that R3 uses to deliver files to researchers. In order to gain access to your identified or de-identified data, you will obtain a Globus account using your Pitt or UPMC login. R3 may also have you use Globus to upload identified data to us for R3 processing or review.

How do I get a Globus account?
To give you access to the R3 Globus data repository, please create a Globus user ID by navigating to http://app.globus.org

On the login screen, use your University of Pittsburgh or UPMC login credentials (choose the organization in the dropdown) to log in. After logging in, email us the email address displayed in 1) Account > 2) Identities > 3) Identity as shown below.

What is Neptune?
The Neptune Research Data Warehouse consists of database systems and other affiliated systems across the UPMC and Pitt networks. On a monthly basis, we compile data from electronic health record systems across UPMC. Data from Epic, MARS, Cerner, Children’s Hospital (Cerner), other systems, and project-specific data sources are extracted. Identifiers are retained on Neptune’s UPMC side, whereas the clinical data itself, without patient identifiers, is managed on the Pitt side. This separation, a design conceived by Shyam Visweswaran, DBMI faculty, creates high efficiency of operation and security without compromise. We extract data at least one month after the actual clinical events to reduce the need to update any data, improving efficiency. Thus, Neptune lags real time by approximately 45 days. On the Pitt side, data is held without identifiers and is structured with reference terminologies and value sets to improve its usability for research, while re-identification remains possible on the UPMC side when authorized.

How is R3 able to provide this data?
We operate under a Business Associates Agreement, or BAA, between the Research Informatics Office and UPMC, which means we work on behalf of UPMC, though we are Pitt investigators and staff. In this way, R3 functions under an IRB Honest Broker Certification. R3 also stewards the implementation of research workflows in clinical systems, and research recruitment based upon clinical data.

How much will R3 cost?
As a charge-back core facility of Health Sciences, we serve UPMC and Pitt investigators and are not permitted to provide “free” service to any group, other than Preparatory to Research (PTR) counts of eligible patients. R3 services are currently of five types: PTR, Small, Medium, Large, and working as informatics scientists on collaborative projects. PTR aggregate counts of patients are free; they are often specified directly in our intake meeting with investigators and sometimes delivered immediately afterward. Small requests are $500, Medium $1000, and Large $2000. The size of a request is derived from the complexity of delivering it. For example, a very simple request, such as detail needed on a list of patients in a clinical trial, may be Small even if it involves identified data, whereas a very complex extraction from Neptune, even if entirely retrospective, may incur a Large charge. We bill for the effort required on a project basis, not by the volume of data or patients, so the count of patients is not a billing issue. However, very large extractions may raise other concerns that necessitate further approvals or review by UPMC. Effort by the UPMC Information Services Division may require special arrangements and billing. When we have to go to new sources of data or process text or images, the project is typically sized as Large. Note also that a typical R01 proposal focused on retrospective data use is usually structured as one Large project, a $2000 charge, per grant year.
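For budgeting, the fee tiers above can be written down as a small lookup. This is a sketch using only the figures stated in this answer; the names FEES_USD, project_fee, and retrospective_grant_total are hypothetical, and collaborative informatics work is left out because no fixed fee is given for it.

```python
# Fees in US dollars, as listed in the answer above.
FEES_USD = {"PTR": 0, "Small": 500, "Medium": 1000, "Large": 2000}


def project_fee(size: str) -> int:
    """Look up the charge for a single request of the given size."""
    if size not in FEES_USD:
        raise ValueError(f"unknown request size: {size!r}")
    return FEES_USD[size]


def retrospective_grant_total(grant_years: int) -> int:
    """Estimate a grant budget line assuming one Large project per grant year."""
    if grant_years < 0:
        raise ValueError("grant_years must be non-negative")
    return grant_years * FEES_USD["Large"]
```

Under the one-Large-project-per-year pattern, a five-year grant would budget retrospective_grant_total(5) dollars for R3. The actual size of any request is determined by R3 during feasibility and budgeting.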

How does the R3 process work?
This diagram represents the R3 workflow. It is divided into four major components, each of which requires engagement between investigators and R3. First is intake via our online request form, linked from the services page of rio.pitt.edu. We then discuss feasibility directly with the investigator team, via teleconference or in person. It is far more effective if the PI attends this intense 30-minute meeting, which sets everything in motion. Feasibility and budgeting, often over email, are done in parallel with investigators to ensure that the IRB protocol is compatible with the request, that the request is feasible, and that a budget is determined. We may need to iterate in order to scope and budget the request, and we may enlist support from the UPMC Information Services Division in cases where our own system access or expertise is insufficient. In the regulatory review step, we collect an attestation from the PI regarding their responsibilities, and we assemble a work description, a cost, and the account number that will be billed. In this step we also clarify investigators’ use of the data and their responsibilities to UPMC and CTSI, for example, providing CTSI/R3 with all published papers that use the data. Finally, we deliver the data, acting as honest broker when required, and complete the cost transfers.

What is an Epic Build?
Epic Build is the name we use to describe any configuration of the EMR system done specifically to support research workflows. For example, a system to collect specific research data in the EMR during clinical care would require an Epic Build. R3 stewards the process: the investigator specifies the build, and R3 works with the UPMC Epic team to implement it and to handle any charge-back costs of the work.

How does ID matching work?
When needed, R3 can match clinical identifiers collected from patients or the EMR in IRB-approved research projects to the other records we hold. To do this, we need to know what type of identifier we are being given (securely via Globus). For example, are these Medical Record Numbers from Epic, from the Medipac billing system, from DICOM headers, etc.? Why? Identifiers of different types, issued over many years at UPMC, may “collide”, with the same number representing different patients in different systems. We can use other information to improve matches, but this is less reliable, so we ask you to state the type of each identifier.
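The collision problem can be illustrated with a small Python sketch. It does not represent R3’s actual matching process; the records, identifier type names, and function names are all invented for the example. The point is that a lookup keyed on the identifier type together with the value stays unambiguous, while a lookup on the value alone does not.

```python
# Hypothetical records: the same number was issued by two different systems.
RECORDS = [
    {"id_type": "epic_mrn", "id_value": "1001", "patient": "patient-A"},
    {"id_type": "medipac", "id_value": "1001", "patient": "patient-B"},
    {"id_type": "epic_mrn", "id_value": "2002", "patient": "patient-C"},
]


def build_index(records):
    """Index patients by (identifier type, identifier value)."""
    return {(r["id_type"], r["id_value"]): r["patient"] for r in records}


def match(index, id_type, id_value):
    """Return the matching patient, or None if there is no match."""
    return index.get((id_type, id_value))


def colliding_values(records):
    """List identifier values that point to more than one patient."""
    patients_by_value = {}
    for r in records:
        patients_by_value.setdefault(r["id_value"], set()).add(r["patient"])
    return sorted(v for v, p in patients_by_value.items() if len(p) > 1)
```

Here the value "1001" resolves to a different patient depending on whether it is declared as an Epic number or a Medipac number, which is why the type must accompany every identifier.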

What happened to CARe?
CARe was a system of UPMC that is no longer operational. It helped researchers at UPMC and the University of Pittsburgh get data and has been replaced by R3. Any new updates of data that you received previously via CARe will need to go through the R3 process for any continuation. All new projects that need data must go through R3.

What is a specification?
A specification is a document that both the researcher and the R3 team agree to adhere to when R3 assembles your data. An example of the beginning of the specification is below:

What is an attestation and why do I have to sign one in order to get my data?
An attestation is an informal assertion we require to document our work, not a legal contract. It demonstrates for everyone that we are “on the same page”, literally, between the researcher and the R3 team. The researcher is responsible for sending a 32-digit Pitt account number for billing, keeping the data secure per IRB requirements, and adhering to other rules and regulations set forth by the IRB, CTSI, and UPMC. R3 is responsible for securely delivering the data described on the specification.