Friday, October 21, 2011

EXPERIMENTAL RESEARCH

Experimental research has had a long tradition in psychology and education. The experimental method formally surfaced in educational psychology around the turn of the twentieth century, with the classic studies by Thorndike and Woodworth on transfer.

ANOVA on residual scores

Residual scores represent the difference between observed posttest scores and their predicted values from a simple regression using the pretest scores as a predictor. An attractive characteristic of residual scores is that, unlike gain scores, they do not correlate with the observed pretest scores.
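
As a rough illustration, here is a minimal Python sketch (with hypothetical data and variable names) of the procedure: regress the posttest scores on the pretest scores, take the residuals, and run a one-way ANOVA on those residuals across groups.

import numpy as np
from scipy import stats

# Hypothetical pretest/posttest scores; group 0 = control, group 1 = treatment
pre   = np.array([10, 12, 11, 14, 13,  9, 15, 12, 11, 13], dtype=float)
post  = np.array([14, 16, 15, 18, 17, 12, 20, 15, 16, 17], dtype=float)
group = np.array([ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1])

# Simple regression of posttest on pretest, pooled over groups
slope, intercept, r, p, se = stats.linregress(pre, post)
residuals = post - (intercept + slope * pre)   # observed minus predicted

# One-way ANOVA on the residual scores across the two groups
f_stat, p_value = stats.f_oneway(residuals[group == 0], residuals[group == 1])
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")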

Thursday, October 20, 2011

The T-Test

This analysis is appropriate whenever you want to compare the means of two groups, and especially appropriate as the analysis for the posttest-only two-group randomized experimental design. The t-test assesses whether the means of two groups are statistically different from each other.

The t-value will be positive if the first mean is larger than the second and negative if it is smaller. Once you compute the t-value you have to look it up in a table of significance to test whether the ratio is large enough to say that the difference between the groups is not likely to have been a chance finding. To test the significance, you need to set a risk level (called the alpha level). In most social research, the "rule of thumb" is to set the alpha level at .05. This means that five times out of a hundred you would find a statistically significant difference between the means even if there was none (i.e., by "chance"). You also need to determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the total number of persons in both groups minus 2. Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of significance (available as an appendix in the back of most statistics texts) to determine whether the t-value is large enough to be significant. If it is, you can conclude that the difference between the means of the two groups is statistically significant (even given the variability).
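
As a rough illustration, the following Python sketch (with hypothetical scores) carries out these steps: it computes the t-value and the degrees of freedom, and compares the p-value with an alpha level of .05.

import numpy as np
from scipy import stats

treatment = np.array([23, 25, 28, 30, 27, 26, 24, 29], dtype=float)
control   = np.array([20, 22, 19, 24, 21, 23, 22, 20], dtype=float)

t_value, p_value = stats.ttest_ind(treatment, control)  # equal-variance t-test
df = len(treatment) + len(control) - 2                  # persons in both groups minus 2

alpha = 0.05
print(f"t({df}) = {t_value:.3f}, p = {p_value:.4f}")
print("significant" if p_value < alpha else "not significant")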

Wednesday, October 19, 2011

ANCOVA : Covariate

In statistics, a covariate is a variable that is possibly predictive of the outcome under study. A covariate may be of direct interest, or it may be a confounding or interacting variable. In a more specific usage, a covariate is a secondary variable that can affect the relationship between the dependent variable and other independent variables of primary interest. In econometrics, the term "control variable" is usually used instead of "covariate". The alternative terms explanatory variable, independent variable, and predictor are used in regression analysis.


An example is provided by the analysis of trend in sea-level by Woodworth (1987). Here the dependent variable (and variable of most interest) was the annual mean sea level at a given location for which a series of yearly values were available. The primary independent variable was "time". Use was made of a "covariate" consisting of yearly values of annual mean atmospheric pressure at sea level. The results showed that inclusion of the covariate allowed improved estimates of the trend against time to be obtained, compared to analyses which omitted the covariate.

ANCOVA Assumptions

In statistics, analysis of covariance (ANCOVA) is a general linear model with a continuous outcome variable (quantitative, scaled) and two or more predictor variables where at least one is continuous (quantitative, scaled) and at least one is categorical (nominal, non-scaled). ANCOVA is a merger of ANOVA and regression for continuous variables. ANCOVA tests whether certain factors have an effect on the outcome variable after removing the variance for which quantitative predictors (covariates) account. The inclusion of covariates can increase statistical power because it accounts for some of the variability.


Like any statistical procedure, the interpretation of ANCOVA depends on certain assumptions about the data entered into the model. For instance, the F-test assumes that the errors are normally distributed and homoscedastic. Since ANCOVA is a method based on linear regression, the relationship of the dependent variable to the independent variable(s) must be linear in the parameters. A simplifying assumption (not necessary to run ANCOVA) is homogeneity of regression, which says that the relationship between the covariate and the dependent variable should be similar across all groups of the independent variable.

ANCOVA requires at least one categorical independent variable and at least one interval or metric independent variable. The categorical independent variable is called a factor, whereas the metric independent variable is called a covariate.

In ANCOVA, the most common use of a covariate is to remove extraneous variation from the dependent variable, because the effect of the factors is of major concern.

ANCOVA shares the assumptions of ANOVA. These assumptions are as follows:

The observations (and hence the error terms) being analyzed should be independent of one another; this holds for ANCOVA just as it does for ANOVA.

In ANOVA, the dependent variable must have the same variance in each category of the independent variable. When there is more than one independent variable, the variance must be homogeneous within each cell formed by the categorical independent variables; this also holds for ANCOVA.

In ANOVA, it is assumed that the data on which the significance test is conducted are obtained by random sampling; this also holds for ANCOVA.

When analysis of variance is conducted on two or more factors, interactions can arise. An interaction occurs when the effect of an independent variable on the dependent variable differs across the categories, or levels, of another independent variable. A significant interaction may be ordinal or disordinal, and a disordinal interaction may be of the crossover or non-crossover type. In balanced designs, the relative importance of factors in explaining the variation in the dependent variable is measured by omega squared. Multiple comparisons in the form of a priori or a posteriori contrasts can be used to examine differences among specific means in ANCOVA.

In ANCOVA, the adjusted treatment means are computed (estimated) on the assumption that the factor-by-covariate interaction is negligible. If this assumption is violated, adjusting the response variable to a common value of the covariate will be misleading. The relationship between the covariate and the dependent variable must also be linear in the parameters, and the errors at the different levels of the independent variable are assumed to follow a normal distribution with mean zero. In other words, ANCOVA combines the assumptions of ANOVA with those of linear regression, since the method is carried out using a linear regression.

ANCOVA also assumes homogeneity of regression coefficients: the regression coefficient relating the covariate to the dependent variable should be the same for every group of the independent variable. If this assumption is violated, the ANCOVA results will be misleading.
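
To make the homogeneity-of-regression check concrete, here is a minimal sketch using Python and statsmodels, with hypothetical data and column names. The first model is the ANCOVA itself (factor plus covariate); the second adds the group-by-covariate interaction, which should be non-significant if the regression slopes are homogeneous.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "post":  [14, 16, 15, 18, 17, 12, 20, 15, 16, 17],
    "pre":   [10, 12, 11, 14, 13,  9, 15, 12, 11, 13],
    "group": ["ctrl"] * 5 + ["treat"] * 5,
})

# ANCOVA: categorical factor plus continuous covariate
ancova = smf.ols("post ~ C(group) + pre", data=data).fit()
print(anova_lm(ancova, typ=2))

# Homogeneity of regression slopes: the interaction term should be non-significant
slopes = smf.ols("post ~ C(group) * pre", data=data).fit()
print(anova_lm(slopes, typ=2))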

statistics solutions

I provide statistical solutions for students, lecturers, and professors completing their dissertations, Ph.D. theses, and projects, especially in Education and Psychology.

My Education: Ph.D. Economics,

Ph.D. Education

Current Post: Asst. Professor of Economics

If you want statistics solutions, you can contact me at anu0562@gmail.com.

Wednesday, October 12, 2011

ANOVA : Analysis of Variance

In statistics, analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are all equal, and therefore generalizes the t-test to more than two groups. Doing multiple two-sample t-tests would result in an increased chance of committing a Type I error. For this reason, ANOVAs are useful in comparing two, three, or more means.
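
As a rough illustration, here is a minimal Python sketch (hypothetical data) of a one-way ANOVA for three groups, testing whether all three means are equal instead of running multiple pairwise t-tests.

import numpy as np
from scipy import stats

group_a = np.array([21, 23, 20, 25, 22], dtype=float)
group_b = np.array([28, 27, 30, 26, 29], dtype=float)
group_c = np.array([24, 23, 26, 25, 24], dtype=float)

# One-way ANOVA: a single F-test across the three group means
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")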

Tuesday, October 11, 2011

Statistical Analysis of Data in Education and Psychology Research

Statistical Analysis of Data in Education and Psychology Research: Revealing Facts From Data

Statistics

Statistics is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments.

Population

In applying statistics to a scientific, industrial, or societal problem, it is necessary to begin with a population or process to be studied. Populations can be diverse topics such as "all persons living in a country" or "every atom composing a crystal". A population can also be composed of observations of a process at various times, with the data from each observation serving as a different member of the overall group. Data collected about this kind of "population" constitutes what is called a time series.

Sample

For practical reasons, a chosen subset of the population called a sample is studied — as opposed to compiling data about the entire group (an operation called census). Once a sample that is representative of the population is determined, data are collected for the sample members in an observational or experimental setting. These data can then be subjected to statistical analysis, serving two related purposes: description and inference.

Descriptive statistics

Descriptive statistics summarize the population data by describing what was observed in the sample numerically or graphically. Numerical descriptors include mean and standard deviation for continuous data types (like heights or weights), while frequency and percentage are more useful in terms of describing categorical data (like race).
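
As a rough illustration, the following Python sketch (hypothetical data) computes a mean and standard deviation for a continuous variable, and frequencies with percentages for a categorical one.

import numpy as np

heights = np.array([162, 170, 158, 175, 168, 180, 165], dtype=float)
print(f"mean = {heights.mean():.1f}, sd = {heights.std(ddof=1):.1f}")  # sample sd

categories = ["A", "B", "A", "A", "C", "B", "A"]
values, counts = np.unique(categories, return_counts=True)
for v, c in zip(values, counts):
    print(f"{v}: n = {c}, {100 * c / len(categories):.1f}%")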

Inferential statistics

Inferential statistics uses patterns in the sample data to draw inferences about the population represented, accounting for randomness. These inferences may take the form of: answering yes/no questions about the data (hypothesis testing), estimating numerical characteristics of the data (estimation), describing associations within the data (correlation) and modeling relationships within the data (for example, using regression analysis). Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also include data mining.

Statistical methods

Experimental and observational studies

A common goal for a statistical research project is to investigate causality, and in particular to draw a conclusion on the effect of changes in the values of predictors or independent variables on dependent variables or response. There are two major types of causal statistical studies: experimental studies and observational studies. In both types of studies, the effect of differences in an independent variable (or variables) on the behavior of the dependent variable is observed. The difference between the two types lies in how the study is actually conducted. Each can be very effective. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation. Instead, data are gathered and correlations between predictors and response are investigated.

Experiments

The basic steps of a statistical experiment are:

1. Planning the research, including finding the number of replicates of the study, using the following information: preliminary estimates regarding the size of treatment effects, alternative hypotheses, and the estimated experimental variability. Consideration of the selection of experimental subjects and the ethics of research is necessary. Statisticians recommend that experiments compare (at least) one new treatment with a standard treatment or control, to allow an unbiased estimate of the difference in treatment effects.

2. Design of experiments, using blocking to reduce the influence of confounding variables, and randomized assignment of treatments to subjects to allow unbiased estimates of treatment effects and experimental error. At this stage, the experimenters and statisticians write the experimental protocol that shall guide the performance of the experiment and that specifies the primary analysis of the experimental data.

3. Performing the experiment following the experimental protocol and analyzing the data following the experimental protocol.

4. Further examining the data set in secondary analyses, to suggest new hypotheses for future study.

5. Documenting and presenting the results of the study.

Observational study

An example of an observational study is one that explores the correlation between smoking and lung cancer. This type of study typically uses a survey to collect observations about the area of interest and then performs statistical analysis. In this case, the researchers would collect observations of both smokers and non-smokers, perhaps through a case-control study, and then look for the number of cases of lung cancer in each group.

Levels of measurement

There are four main levels of measurement used in statistics: nominal, ordinal, interval, and ratio. Each of these has a different degree of usefulness in statistical research. Ratio measurements have both a meaningful zero value and the distances between different measurements defined; they provide the greatest flexibility in statistical methods that can be used for analyzing the data. Interval measurements have meaningful distances between measurements defined, but the zero value is arbitrary (as in the case with longitude and temperature measurements in Celsius or Fahrenheit). Ordinal measurements have imprecise differences between consecutive values, but have a meaningful order to those values. Nominal measurements have no meaningful rank order among values.

Key terms used in statistics

Null hypothesis

Interpretation of statistical information can often involve the development of a null hypothesis, the assumption being that whatever is proposed as a cause has no effect on the variable being measured.

The best illustration for a novice is the predicament encountered by a jury trial. The null hypothesis, H0, asserts that the defendant is innocent, whereas the alternative hypothesis, H1, asserts that the defendant is guilty. The indictment comes because of suspicion of the guilt. The H0 (status quo) stands in opposition to H1 and is maintained unless H1 is supported by evidence "beyond a reasonable doubt". However, "failure to reject H0" in this case does not imply innocence, but merely that the evidence was insufficient to convict. So the jury does not necessarily accept H0 but fails to reject H0. While one cannot "prove" a null hypothesis, one can test how close it is to being true with a power test, which tests for Type II errors.

Error

Working from a null hypothesis two basic forms of error are recognized:

- Type I errors, where the null hypothesis is falsely rejected, giving a "false positive".

- Type II errors, where the null hypothesis fails to be rejected and an actual difference between populations is missed, giving a "false negative".

Error also refers to the extent to which individual observations in a sample differ from a central value, such as the sample or population mean. Many statistical methods seek to minimize the mean-squared error, and these are called "methods of least squares."

Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other important types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also be important.

Interval estimation

Most studies will only sample part of a population and so the results are not fully representative of the whole population. Any estimates obtained from the sample only approximate the population value. Confidence intervals allow statisticians to express how closely the sample estimate matches the true value in the whole population. Often they are expressed as 95% confidence intervals. Formally, a 95% confidence interval for a value is a range where, if the sampling and analysis were repeated under the same conditions (yielding a different dataset), the interval would include the true (population) value 95% of the time.
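
As a rough illustration, here is a minimal Python sketch (hypothetical data) of a 95% confidence interval for a sample mean, using the t distribution because the population standard deviation is unknown.

import numpy as np
from scipy import stats

sample = np.array([12.1, 11.8, 13.0, 12.5, 11.9, 12.7, 12.3, 12.9])
mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
low, high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = ({low:.2f}, {high:.2f})")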

Significance

Statistics rarely give a simple yes/no answer to the question asked of them. Interpretation often comes down to the level of statistical significance applied to the numbers, which often refers to the probability of a value accurately rejecting the null hypothesis.

Student's t-test

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is supported. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known. When the scaling term is unknown and is replaced by an estimate based on the data, the test statistic (under certain conditions) follows a Student's t distribution.
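
As a rough illustration, the following Python sketch (hypothetical data) shows the idea for a one-sample test: the sample standard deviation replaces the unknown population value in the scaling term, and the resulting statistic is referred to a Student's t distribution.

import numpy as np
from scipy import stats

sample = np.array([5.1, 4.8, 5.4, 5.0, 5.6, 4.9, 5.3])
mu0 = 5.0  # hypothesized population mean

# Hand computation of the t statistic, with the sample sd replacing the unknown sigma
t_manual = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(len(sample)))
t_scipy, p_value = stats.ttest_1samp(sample, mu0)
print(f"manual t = {t_manual:.3f}, scipy t = {t_scipy:.3f}, p = {p_value:.3f}")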


Students, lecturers, and professors can contact here for statistical analysis of their research data with the help of statistical software.

It can be done within a very short time.

http://en.wikipedia.org/wiki/Statistics