Statistic Corner
Common data manipulations with R in biological researches
Abstract
R is a programming language that is widely used in the scientific community for its powerful data analysis and visualization capabilities, most of which are provided by contributed packages. Because each package imposes strict format requirements on its input data, the original data usually must be manipulated appropriately beforehand. Unfortunately, users, especially beginners, are often confused by R's extreme flexibility in data manipulation. In this paper, we roughly group the common R manipulations for biological data into four classes: data overview, transformation, summarization, and reshaping. We then illustrate these manipulations using a sample dataset of clinical records from diabetic patients. Our main purpose is to give a clearer picture of data manipulation with R and thereby facilitate its practical application in biological research.