The order in which our biostatistician is listed as an author depends on the extent of their contribution:
First author: The biostatistician plays a major role in project development or hypothesis generation, substantially changes the direction of the hypothesis and corresponding analysis (when working with existing data), provides a unique contribution, conducts the analysis, and writes a large portion of your paper.
Second author: The biostatistician plays a major role in project development, offers statistical expertise needed for your research, or conducts substantial and/or complex analysis of data. This is a typical position for a biostatistician involved in the life cycle of a project or who has unique analytic expertise (e.g., predictive modeling, causal inference, genomics, etc.).
Middle-listed author: The biostatistician does not necessarily collaborate in the design of the study, but conducts data analysis, writes the statistical methods section and assists in interpreting results for conclusion.
Senior author: The biostatistician provides significant mentoring to the first author, offering guidance for the first author to conduct analysis, or plays a major role in helping the first author design the study. There generally has been substantial one-on-one meeting time as well as additional analysis and paper writing support beyond a typical collaborative role.
Sometimes a secondary biostatistician will be involved in the project to provide additional support. The secondary statistician should immediately follow the primary statistician in the authorship line (e.g., if the primary statistician is the second author, the secondary statistician would generally be listed as the third author).
Projects whose costs are subsidized by the Cancer Center are required to submit publications to PubMed Central and to cite the Cancer Center grant (P30CA046934) in posters and publications. If applicable, any grants or other support related to the project, including the Colorado Clinical & Translational Sciences Institute and the REDCap project, should also be cited.
Grants: For an efficient proposal development process, please get us involved as early as possible. Expect multiple interactions with the statistician during the development process. Developed proposals (e.g., clear aims, hypotheses, and endpoints) must be provided to the statistician at least 1 month prior to their submission date.
Data analysis: Allow a minimum of 1 month for analysis (depending on complexity) and write-up/presentation preparation AFTER the statistician receives the completed and cleaned dataset.
Data Cleaning and Management
Analysis datasets must be cleaned and in a format ready for analysis. All datasets should be provided as either REDCap databases or CSV (comma-separated values) files. When available, raw data used to generate calculated analysis variables should also be provided.
Is REDCap an option? If yes, use an existing project as a starting template.
Note: Entering data in REDCap is the preferred method, as it meets HIPAA regulations and is version controlled.
If REDCap is not an option and the data contain PHI, please work with a statistician to develop a plan to transfer the data in a HIPAA-compliant manner.
Review with the statistician and ask for help.
Leave excluded patients in the dataset along with the reason for exclusion, and use an indicator variable to identify them as excluded.
Include all essential dates for variables and follow ups.
Datasets sent to the statistician just a few days before an abstract deadline will not be analyzed.
Allow at least 1 month for analysis and write-up/presentation preparation AFTER the statistician receives the completed and cleaned dataset.
Remove PHI (if applicable).
Review outliers (do not remove from the dataset).
Review missing observations (make sure they are represented the same way throughout the dataset).
Keep variable names short and meaningful, and remove spaces from variable names.
Do not start variable names with a number or special character.
Do not mix character/class and numeric values in the same variable – define as text only if absolutely required.
Be sure categorical variables have the correct number and spelling of categories (e.g., Male, Female. NOT: male, M, female, Female, F).
Define numeric variables as numeric, not text, and include range checks to prevent invalid data entry.
Remove merged cells and don’t include multiple observations or attributes in a single variable.
Limit text fields to variables not required in the analysis as they can only be used in descriptive reports.
Data on spreadsheets must be free of color/highlighting and clinical notes – use indicator variables to define groups.
Data dictionary must be present, understandable, and include a coding guide for numeric variables.
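Several of the checks above can be run automatically before a CSV file is sent to the statistician. The following is a minimal sketch in Python, not an official tool of the shared resource. The function name check_dataset, the naming rule in VALID_NAME, the list of missing-value codes in MISSING_TOKENS, and the sample data are all assumptions made for illustration, and the sketch assumes a rectangular CSV with a header row. It flags variable names that contain spaces or start with a number or special character, variables that mix numeric and text values, missing values coded more than one way, and categories that differ only in capitalization.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed naming rule: start with a letter, then letters, digits, or underscores.
VALID_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")
# Assumed missing-value codes; adjust to match your data dictionary.
MISSING_TOKENS = {"", "NA", "N/A", "na", ".", "NULL", "missing"}


def is_number(value):
    """Return True if the text can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def check_dataset(csv_text):
    """Return a list of warnings for a CSV dataset supplied as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            columns[name].append((row.get(name) or "").strip())

    warnings = []
    for name, values in columns.items():
        if not VALID_NAME.match(name):
            warnings.append(f"{name!r}: invalid variable name")
        # More than one distinct missing code means inconsistent coding.
        missing_seen = {v for v in values if v in MISSING_TOKENS}
        if len(missing_seen) > 1:
            warnings.append(f"{name!r}: missing values coded inconsistently")
        present = [v for v in values if v not in MISSING_TOKENS]
        numeric = [is_number(v) for v in present]
        if any(numeric) and not all(numeric):
            warnings.append(f"{name!r}: mixes numeric and text values")
        elif present and not any(numeric):
            # Group text categories by lowercase form to catch case variants.
            by_lower = defaultdict(set)
            for v in present:
                by_lower[v.lower()].add(v)
            if any(len(spellings) > 1 for spellings in by_lower.values()):
                warnings.append(f"{name!r}: inconsistent category spelling")
    return warnings


# Made-up sample data containing three of the problems described above.
sample = """patient_id,sex,age,2nd visit
1,Male,34,2020-01-05
2,male,NA,2020-02-11
3,Female,unknown,
4,Female,51,2020-03-02
"""

for warning in check_dataset(sample):
    print(warning)
```

A script like this only catches mechanical problems. It does not detect abbreviations such as M and F, and it does not replace reviewing outliers, dates, and the data dictionary with the statistician.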
The Biostatistics and Bioinformatics Shared Resource offers an educational series throughout the year. These seminars are typically held on the third Thursday of the month from 1-2 pm in RC2 North in room 6107.
Examples of past seminar topics:
How to work efficiently and effectively with a statistician