What I Have Learned from STAT 5013 This Semester
The explosive growth of data volume, together with advances in hardware and storage technology, has turned large amounts of data into a source of potentially limitless wealth. All walks of life now claim to be engaged in big data: computer science, information technology, applied mathematics, computational mathematics, operations research, industrial engineering, electronic engineering, and even politics all have people who have started using big data and claim to work in it. Amid all these voices, the voice of statistics seems hard to hear.
From this class, I learned that big data does not automatically mean comprehensive, accurate, or truthful data. Statistics, however, plays a vital role in giving big data its value and vitality. Statistics often has to construct and solve "undefined" problems: statisticians like well-structured data and clearly posed statistical questions, but the opportunities big data brings do not always fit within this "traditional" framework. Statistics typically works with sample data, using estimates computed from the sample to infer quantities about the whole population. Of course, there are also traditional statistical tools such as the t-test, the chi-square test, and ANOVA.
I would like to give an example of what I learned from this course. ANOVA can be seen as a special kind of linear model: when a covariate is a factor (such as gender, or treatment vs. control), statisticians found that the linear model takes a simpler, easier form and admits a variance decomposition. The table you see in linear model output is not necessarily an ANOVA table; it reports the significance of each covariate, and when a covariate is a factor, each row corresponds to one level of that factor.
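To make this concrete, here is a minimal sketch in Python of fitting a one-way ANOVA as a linear model with a dummy-coded factor. The group labels, simulated data, and variable names are my own illustrative assumptions, not something from the course:

```python
import numpy as np

# Hypothetical data: a factor with 3 levels (e.g., control, treatment A, treatment B)
rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], 10)
y = 5.0 + 1.5 * (groups == 1) + rng.normal(0, 1, groups.size)

# Dummy-code the factor: intercept plus indicator columns for levels 1 and 2
X = np.column_stack([np.ones_like(y),
                     (groups == 1).astype(float),
                     (groups == 2).astype(float)])

# Fit the linear model by least squares
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

# Variance decomposition: total sum of squares = model + residual
ss_total = np.sum((y - y.mean()) ** 2)
ss_resid = np.sum((y - fitted) ** 2)
ss_model = ss_total - ss_resid

df_model = X.shape[1] - 1        # non-intercept columns = levels - 1
df_resid = y.size - X.shape[1]
F = (ss_model / df_model) / (ss_resid / df_resid)
print(f"F = {F:.3f} on ({df_model}, {df_resid}) df")
```

Running scipy.stats.f_oneway on the same three groups gives the same F statistic, which is exactly the sense in which one-way ANOVA is just a linear model with a dummy-coded factor.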
In fact, ANOVA (analysis of variance) is a common statistical model for data analysis. ANOVA relies on the F-distribution to assess significance: it uses sums of squares and their degrees of freedom to calculate mean squares, whose ratio is the F statistic. If the overall test shows a significant difference, we then consider follow-up hypotheses, known as multiple comparisons, to investigate where the differences between the groups lie; common methods are Scheffé's method, the Tukey-Kramer method, and the Bonferroni correction. Put simply, the procedure is to plug the observed data into the test statistic, obtain its value, and either compare it with a critical value or compute the p-value, and then decide between the null and the alternative hypothesis.
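As a sketch of this two-stage procedure, assuming three made-up groups like the ones above, the following runs the overall F test and then Bonferroni-corrected pairwise t-tests (the Bonferroni approach is just one of the multiple-comparison methods mentioned):

```python
from itertools import combinations
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical samples for three groups
samples = {"control": rng.normal(5.0, 1.0, 10),
           "A": rng.normal(6.5, 1.0, 10),
           "B": rng.normal(5.2, 1.0, 10)}

# Overall one-way ANOVA F test
F, p = stats.f_oneway(*samples.values())
print(f"ANOVA: F = {F:.3f}, p = {p:.4f}")

# Bonferroni-corrected pairwise comparisons
pairs = list(combinations(samples, 2))
for g1, g2 in pairs:
    t, p_raw = stats.ttest_ind(samples[g1], samples[g2])
    p_adj = min(1.0, p_raw * len(pairs))  # multiply by number of comparisons
    print(f"{g1} vs {g2}: t = {t:.3f}, adjusted p = {p_adj:.4f}")
```

Only if the overall F test is significant do the pairwise comparisons tell us which specific groups differ.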
Next, the p-value: it is the probability, computed under the assumption that the null hypothesis is true, of observing a result at least as extreme as the one obtained. Then there is the α error (Type I error), the mistake we make when we reject a null hypothesis that is in fact true. For example, suppose the null hypothesis is that a product's average weight equals 100. We draw a sample of products and compute a p-value of 0.01, which means that if the average weight really were 100, data as extreme as ours would occur with probability only 0.01: a small-probability event. According to the theory, such an event should practically never happen in a single experiment, so when it does, we have reason to doubt the null hypothesis and conclude that the average weight is not equal to 100. Yet we should not be too eager to reject at once; doing so is still a little reckless, since it carries a risk of 0.01 of being wrong. Only when this risk is smaller than the risk we can afford, the pre-specified α, do we make the judgment to reject the null hypothesis.
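A minimal sketch of this decision rule follows; the essay's null hypothesis of average weight = 100 is kept, while the simulated weights, sample size, and α = 0.05 are assumptions for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical sample of product weights; true mean deliberately not 100
weights = rng.normal(loc=99.2, scale=1.5, size=30)

alpha = 0.05  # the Type I (alpha) error risk we are willing to accept
t, p = stats.ttest_1samp(weights, popmean=100)

print(f"t = {t:.3f}, p = {p:.4f}")
if p < alpha:
    print("Reject H0: the average weight differs from 100.")
else:
    print("Fail to reject H0: no evidence the average weight differs from 100.")
```

The comparison p < alpha is exactly the "risk smaller than the risk we can afford" judgment described above.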
...