Employee Performance Analysis

Project Outline:

This research examines employee performance in an organization. Data on several factors, namely Employee Productivity, Customer Satisfaction Scores, Accuracy Scores, and the Experience and Age of employees, are taken into consideration. Statistical methods are used to identify whether the Age and Experience of employees have any impact on Productivity, Customer Satisfaction and Accuracy.

Theoretical Framework:

XYZ Corporation, operating out of Illinois, US, wants to find out whether the age and experience of its employees have an impact on their performance. It has hired an external consultant to study the impact of these two factors (age and experience) on employee performance metrics. Based on the results of this research, XYZ Corporation will design a recruitment strategy aimed at hiring the talent most likely to deliver maximum performance.

Design and Methodology:

The design and methodology used by the external consultant began with identifying the performance factors common across the different businesses within XYZ Corporation. The performance measures common to all businesses were:

Customer Satisfaction Scores
Accuracy Scores
Productivity

The consultants decided to study the impact of age of employees and their experience on the above factors by using statistical methods.

Details on participants and sampling methods:

Sampling Methods:

Sampling is the process of selecting a small number of elements from a larger, defined target group of elements. The population is the total group of elements we want to study; the sample is the subgroup of the population we actually study. Here, a sample means a group of ‘n’ employees chosen randomly from an organization with population ‘N’. Sampling is done in situations such as:

We sample when the process involves destructive testing, e.g. taste tests, car crash tests, etc.
We sample when there are constraints of time and costs
We sample when the populations cannot be easily captured

Sampling is NOT done in situations like:

We do not sample when the events or products are unique and cannot be replicated

Sampling can be done by using several methods including: Simple random sampling, Stratified random sampling, Systematic sampling and Cluster sampling. These are Probability Sampling Methods. Sampling can also be done using methods such as Convenience sampling, Judgment sampling, Quota sampling and Snowball sampling. These are non-probability methods of sampling.

Simple random sampling is a method in which every unit has an equal chance of being selected. Stratified random sampling is a method in which the population is divided into strata (groups) and units are then picked randomly from each stratum. Systematic sampling is a method in which every nth unit is selected from the population. Cluster sampling is a method in which the population is divided into clusters and a random set of whole clusters is selected for study.

For the non-probability methods, Convenience sampling relies on convenience and access. Judgment sampling relies on the belief that participants fit the required characteristics. Quota sampling emphasizes representation of specific characteristics. Snowball sampling relies on respondents referring others with similar characteristics.

In our research, the consulting organization used simple random sampling: they randomly selected 75 employees and gathered data on their age, experience, Customer Satisfaction scores, Accuracy scores and Productivity scores.

The employees were divided into three age groups, namely 20 – 30 years, 30 – 40 years and 40 – 50 years. Similarly, they were divided into three experience groups, namely 0 – 10 years, 10 – 20 years and 20 – 30 years.
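As an illustration only, the short Python sketch below shows how such a simple random sample of 75 employees and the two bucketings could be produced; the employee table, column names and values are hypothetical assumptions, not the consultant's actual data or tooling.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical employee master data (column names and values are assumptions)
employees = pd.DataFrame({
    "age": rng.integers(21, 50, size=500),
    "experience": rng.integers(0, 30, size=500),
})

# Simple random sample of 75 employees, drawn without replacement
sample = employees.sample(n=75, random_state=1)

# Divide age and experience into the three groups used in the study
sample["age_bucket"] = pd.cut(sample["age"], bins=[20, 30, 40, 50],
                              labels=["20-30 years", "30-40 years", "40-50 years"])
sample["experience_bucket"] = pd.cut(sample["experience"], bins=[0, 10, 20, 30],
                                     labels=["0-10 years", "10-20 years", "20-30 years"],
                                     include_lowest=True)
print(sample.head())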

Data Analysis:

Below are the different data analyses performed by the consultant:

Impact of Age on Accuracy
Impact of Experience on Accuracy
Impact of Age on Customer Satisfaction
Impact of Experience on Customer Satisfaction
Impact of Age on Productivity
Impact of Experience on Productivity

For each of the above analyses we need hypothesis testing. Hypothesis testing tells us whether there is a statistically significant difference between data sets, i.e. whether they should be considered to come from different distributions. The differences that can be detected using hypothesis testing are:

Continuous Data – Difference in Average, Difference in Variation
Discrete Data – Difference in Proportion Defective

We follow the below steps for Hypothesis testing:

Step 1 : Determine appropriate Hypothesis test
Step 2 : State the Null Hypothesis Ho and Alternate Hypothesis Ha
Step 3 : Calculate Test Statistics / P-value against table value of test statistic
Step 4 : Interpret results – Accept or reject Ho

The mechanism of Hypothesis testing involves the following:

Ho = Null Hypothesis – There is No statistically significant difference between the two groups
Ha = Alternate Hypothesis – There is statistically significant difference between the two groups

We also have different types of errors that can occur in hypothesis testing. The errors are noted below:

Type I Error – P (Reject Ho when Ho is true) = α
Type II Error – P (Accept Ho when Ho is false) = β

The p-value is a statistical measure that indicates the probability of making a Type I (α) error; its value ranges between 0 and 1. We normally work with a 5% alpha risk, and α should be specified before the hypothesis test is conducted. If the p-value is greater than 0.05, there is no statistically significant difference between the groups and we accept Ho; if the p-value is less than 0.05, there is a statistically significant difference and we reject Ho in favour of Ha.
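The decision rule described above amounts to a single comparison of the p-value against the pre-specified alpha risk; the tiny Python sketch below is purely illustrative.

ALPHA = 0.05  # pre-specified Type I error (alpha) risk

def decide(p_value, alpha=ALPHA):
    # Reject Ho when the p-value falls below the chosen alpha risk
    if p_value < alpha:
        return "Reject Ho: statistically significant difference"
    return "Accept Ho: no statistically significant difference"

print(decide(0.000))   # e.g. the ANOVA p-values reported later in this study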

We will also discuss the main types of hypothesis tests:

1-Sample t-test: It’s used when we have Normal Continuous Y and Discrete X. It is used for comparing a population mean against a given standard. For example: Is the mean Turn Around Time of a thread ≤ 15 minutes?
2-Sample t-test: It’s used when we have Normal Continuous Y and Discrete X. It is used for comparing means of two different populations. For example: Is the mean performance of morning shift = mean performance of night shift.
ANOVA: It’s used when we have Normal Continuous Y and Discrete X. It is used for comparing the means of more than two populations. For example: Is the mean performance of staff A = mean performance of staff B = mean performance of staff C.
Homogeneity Of Variance: It’s used when we have Normal Continuous Y and Discrete X. It is used for comparing the variance of two or more than two populations. For example: Is the variation of staff A = variation of staff B = variation of staff C.
Mood’s Median Test: It’s used when we have Non-normal Continuous Y and Discrete X. It is used for Comparing the medians of two or more than two populations. For example: Is the median of staff A = median of staff B = median of staff C.
Simple Linear Regression: It’s used when we have Continuous Y and Continuous X. It is used to see how output (Y) changes as the input (X) changes. For example: If we need to find out how staff A’s accuracy is related to his number of years spent in the process.
Chi-square Test of Independence: It’s used when we have Discrete Y and Discrete X. It is used to see how output counts (Y) from two or more sub-groups (X) differ. For example: If we want to find out whether defects from morning shift are significantly different from defects in the evening shift.
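For reference, each of the tests listed above has a counterpart in Python's scipy.stats; the calls below are a sketch only, run on small placeholder arrays rather than the study data, and the variable names are assumptions.

import numpy as np
from scipy import stats

# Placeholder data purely for illustration (not from this study)
a = np.array([14.2, 15.1, 15.8, 14.9, 16.0])
b = np.array([13.5, 14.0, 15.2, 14.8, 13.9])
c = np.array([16.1, 15.7, 16.4, 15.9, 16.8])
counts = np.array([[30, 10], [25, 15]])        # defect counts by shift and outcome

print(stats.ttest_1samp(a, popmean=15))        # 1-sample t-test against a standard
print(stats.ttest_ind(a, b))                   # 2-sample t-test: two population means
print(stats.f_oneway(a, b, c))                 # one-way ANOVA: more than two means
print(stats.levene(a, b, c))                   # homogeneity of variance
print(stats.median_test(a, b, c))              # Mood's median test for non-normal data
print(stats.linregress(a, b))                  # simple linear regression: continuous X and Y
print(stats.chi2_contingency(counts))          # chi-square test of independence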

Let’s look at each of the analysis for our research:

Impact of Age on Accuracy

Practical Problem: Is Accuracy impacted by the Age of Employees?

Hypothesis: H0: Accuracy is independent of the Age of Employees; H1: Accuracy is impacted by the Age of Employees

Statistical Tool Used: One-Way ANOVA

Conclusion: p-value < 0.05 indicates that the performance measure of accuracy is impacted by the age factor

One-way ANOVA: Accuracy versus Age Bucket

Source       DF       SS       MS      F      P
Age Bucket    2  0.50616  0.25308  67.62  0.000
Error        72  0.26946  0.00374
Total        74  0.77562

S = 0.06118   R-Sq = 65.26%   R-Sq(adj) = 64.29%

Level            N     Mean    StDev
20 – 30 years   26  0.75448  0.06376
30 – 40 years   26  0.85078  0.07069
40 – 50 years   23  0.95813  0.04416

Pooled StDev = 0.06118

Boxplot of Accuracy by Age Bucket

Conclusion: The p-value of the above analysis is < 0.05, so we reject the null hypothesis; the performance measure of accuracy is therefore impacted by the age of employees. As age increases, the accuracy of the employees also increases.
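A minimal sketch of how this kind of one-way ANOVA could be reproduced in Python with scipy is shown below; the accuracy values are placeholders rather than the study data, and the same call applies to the other five analyses in this section.

from scipy import stats

# Placeholder accuracy scores for the three age buckets (illustrative only)
acc_20_30 = [0.74, 0.76, 0.73, 0.78, 0.75]
acc_30_40 = [0.84, 0.86, 0.83, 0.87, 0.85]
acc_40_50 = [0.95, 0.96, 0.94, 0.97, 0.96]

f_stat, p_value = stats.f_oneway(acc_20_30, acc_30_40, acc_40_50)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")   # p < 0.05 -> reject H0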

Impact of Experience on Accuracy

Practical Problem: Is Accuracy impacted by the Experience of Employees?

Hypothesis: H0: Accuracy is independent of the Experience of Employees; H1: Accuracy is impacted by Experience of Employees

Statistical Tool Used: One-Way ANOVA

Conclusion: p-value < 0.05 indicates that the performance measure of accuracy is impacted by the experience factor

One-way ANOVA: Accuracy versus Experience Bucket

Source             DF       SS       MS      F      P
Experience Bucket   2  0.53371  0.26685  79.42  0.000
Error              72  0.24191  0.00336
Total              74  0.77562

S = 0.05796   R-Sq = 68.81%   R-Sq(adj) = 67.94%

Level            N     Mean    StDev
0 – 10 years    24  0.74403  0.05069
10 – 20 years   23  0.84357  0.05354
20 – 30 years   28  0.94696  0.06660

Pooled StDev = 0.05796

Boxplot of Accuracy by Experience Bucket

Conclusion: The p-value of the above analysis is < 0.05, so we reject the null hypothesis; the performance measure of accuracy is therefore impacted by the experience of employees. As experience increases, the accuracy of the employees also increases.

Impact of Age on Customer Satisfaction

Practical Problem: Is the Customer Satisfaction Score impacted by the Age of Employees?

Hypothesis: H0: Customer Satisfaction Score is independent of the Age of Employees; H1: Customer Satisfaction Score is impacted by Age of Employees

Statistical Tool Used: One-Way ANOVA

Conclusion: p-value < 0.05 indicates that the performance measure of Customer Satisfaction Score is impacted by the age factor

One-way ANOVA: Customer Satisfaction versus Age Bucket

Source       DF      SS     MS      F      P
Age Bucket    2   49.51  24.75  18.92  0.000
Error        72   94.23   1.31
Total        74  143.74

S = 1.144   R-Sq = 34.44%   R-Sq(adj) = 32.62%

Level            N   Mean  StDev
20 – 30 years   26  6.906  1.164
30 – 40 years   26  8.041  1.156
40 – 50 years   23  8.907  1.107

Pooled StDev = 1.144

Boxplot of Customer Satisfaction by Age Bucket

Conclusion: The p-value of the above analysis is < 0.05, so we reject the null hypothesis; the performance measure of Customer Satisfaction Score is therefore impacted by the age of employees. As age increases, the Customer Satisfaction Score of the employees also increases.

Impact of Experience on Customer Satisfaction

Practical Problem: Is the Customer Satisfaction Score impacted by the Experience of Employees?

Hypothesis: H0: Customer Satisfaction Score is independent of the Experience of Employees; H1: Customer Satisfaction Score is impacted by Experience of Employees

Statistical Tool Used: One-Way ANOVA

Conclusion: p-value < 0.05 indicates that the performance measure of Customer Satisfaction Score is impacted by the experience factor

One-way ANOVA: Customer Satisfaction versus Experience Bucket

Source             DF      SS     MS      F      P
Experience Bucket   2   51.20  25.60  19.92  0.000
Error              72   92.54   1.29
Total              74  143.74

S = 1.134   R-Sq = 35.62%   R-Sq(adj) = 33.83%

Level            N   Mean  StDev
0 – 10 years    24  7.035  1.277
10 – 20 years   23  7.570  0.922
20 – 30 years   28  8.948  1.160

Pooled StDev = 1.134

Boxplot of Customer Satisfaction by Experience Bucket

Conclusion: The p-value of the above analysis is < 0.05, so we reject the null hypothesis; the performance measure of Customer Satisfaction Score is therefore impacted by the experience of employees. As experience increases, the Customer Satisfaction Score of the employees also increases.

Impact of Age on Productivity

Practical Problem: Is Productivity impacted by the Age of Employees?

Hypothesis: H0: Productivity is independent of the Age of Employees; H1: Productivity is impacted by Age of Employees

Statistical Tool Used: One-Way ANOVA

Conclusion: p-value < 0.05 indicates that the performance measure of Productivity is impacted by the age factor

One-way ANOVA: Productivity versus Age Bucket

Source       DF       SS       MS       F      P
Age Bucket    2  0.74389  0.37194  194.56  0.000
Error        72  0.13765  0.00191
Total        74  0.88153

S = 0.04372   R-Sq = 84.39%   R-Sq(adj) = 83.95%

Level            N     Mean    StDev
20 – 30 years   26  0.93959  0.04287
30 – 40 years   26  0.81511  0.05831
40 – 50 years   23  0.69291  0.01747

Pooled StDev = 0.04372

Boxplot of Productivity by Age Bucket

Conclusion: The p-value of the above analysis is < 0.05, so we reject the null hypothesis; the performance measure of Productivity is therefore impacted by the age of employees. As age increases, the Productivity of the employees decreases.

Impact of Experience on Productivity

Practical Problem: Is Productivity impacted by the Experience of Employees?

Hypothesis: H0: Productivity is independent of the Experience of Employees; H1: Productivity is impacted by Experience of Employees

Statistical Tool Used: One-Way ANOVA

Conclusion: p-value < 0.05 indicates that the performance measure of Productivity is impacted by the experience factor

One-way ANOVA: Productivity versus Experience Bucket

Source             DF       SS       MS       F      P
Experience Bucket   2  0.74024  0.37012  188.61  0.000
Error              72  0.14129  0.00196
Total              74  0.88153

S = 0.04430   R-Sq = 83.97%   R-Sq(adj) = 83.53%

Level            N     Mean    StDev
0 – 10 years    24  0.94474  0.03139
10 – 20 years   23  0.83120  0.05754
20 – 30 years   28  0.70599  0.04118

Pooled StDev = 0.04430

Boxplot of Productivity by Experience Bucket

Conclusion: The p-value of the above analysis is < 0.05, so we reject the null hypothesis; the performance measure of Productivity is therefore impacted by the experience of employees. As experience increases, the Productivity of the employees decreases.

Conclusion of the Analysis:

As Age and Experience increase, the Accuracy and Customer Satisfaction Scores of employees increase
As Age and Experience increase, the Productivity of employees decreases

Bibliography:

The data used in this analysis is self-created data using statistical software.

Research Schedule (Gantt Chart) of the Project:

Quantitative Reasoning and Analysis: An Overview

Frances Roulet

State the statistical assumptions for this test.

Frankfort-Nachmias and Nachmias (2008) refer to statistical inference as the procedure for drawing conclusions about population characteristics based on a sample result. To understand some of these characteristics of the population, a random sample is taken and its properties are studied, concluding by indicating whether the sample is representative of the population.

An estimator function must be chosen for the characteristic of the population under study. Once the estimator function is applied to the sample, the result is an estimate. Using the appropriate statistical test, it can be determined whether this estimate is based only on chance; the hypothesis that it is, is called the null hypothesis and symbolized as H0 (Frankfort-Nachmias & Nachmias, 2008). This is the hypothesis that is tested directly, and if it is rejected as unlikely, the research hypothesis is supported. The complement of the null hypothesis is known as the alternative hypothesis, symbolized as Ha. The two hypotheses are complementary; therefore, it is sufficient to define the null hypothesis.

According to Frankfort-Nachmias and Nachmias (2008), the need for two hypotheses arises out of logical necessity. The null hypothesis corresponds to the negative inference needed to avoid the fallacy of affirming the consequent; in other words, the researcher is required to eliminate false hypotheses rather than accept true ones.

Once the null hypothesis has been formulated, the researcher proceeds to test it against the sample result. The investigator tests the null hypothesis by comparing the sample result to a statistical model that provides the probability of observing such a result. This statistical model is called the sampling distribution (Frankfort-Nachmias & Nachmias, 2008). The sampling distribution allows the researcher to estimate the probability of obtaining the sample result. The threshold for this probability is known as the level of significance, designated as α (alpha); it is also the probability of rejecting a true hypothesis (H0 is rejected even though it is true, a false positive), which is a Type I error. Normally a significance level of α = .05 is used (although other levels such as α = .01 may be used at times), meaning we are willing to tolerate up to 5% Type I errors. The probability value (p-value) of the statistic used to test the null hypothesis is then compared with α: if p is less than or equal to α, the null hypothesis is rejected.

The most common approach for testing a null hypothesis is to select a statistic based on a sample of fixed size, calculate the value of the statistic for the sample, and then reject the null hypothesis if and only if the statistic falls in the critical region. The statistical test may be one-tailed or two-tailed. A one-tailed test specifies the direction of the statistical test; extreme results lead to rejection of the null hypothesis, and the critical region can be located in either tail (Zaiontz, 2015). An example of this is observed in the following graphic:

Figure 1 – Critical region is the right tail

The critical region here is the right (or upper) tail. It is quite possible to have one-sided tests where the critical region is the left (or lower) tail.

In a two-tailed test, the region of rejection is located in both the left and right tails; a two-tailed test does not specify a direction.

An example of this is illustrated graphically as follows:

Figure 2 – Critical region is the left tail.

This possibility is handled as a two-tailed test, with the critical region consisting of both the upper and lower tails. The null hypothesis is rejected if the test statistic falls in either side of the critical region, and to achieve a significance level of α, the critical region in each tail must have size α/2.
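To make the α versus α/2 point concrete, the sketch below computes the standard normal critical values for a one-tailed and a two-tailed test at α = .05; this is illustrative only, and the appropriate reference distribution depends on the test being used.

from scipy.stats import norm

alpha = 0.05
one_tailed_cutoff = norm.ppf(1 - alpha)                     # about 1.645, all of alpha in one tail
two_tailed_cutoffs = norm.ppf([alpha / 2, 1 - alpha / 2])   # about -1.96 and +1.96, alpha/2 in each tail
print(one_tailed_cutoff, two_tailed_cutoffs)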

Statistical power is 1 − β, the probability of rejecting a false null hypothesis, where β is the probability of a Type II error. While a significance level of α = .05 is typically used for Type I error, the target for β is generally .20 or .10, so that .80 or .90 is used as the target value for power (Zaiontz, 2015).

When reading about effect size, it is important to understand that an effect is the size of the variance explained by the statistical model, as opposed to the error, which is the size of the variance not explained by the model. The effect size is a standardized measure of the magnitude of an effect. Because it is standardized, effects can be compared across different studies with different variables and different scales. For example, the difference in means between two groups can be expressed in terms of the standard deviation: an effect size of 0.5 signifies that the difference between the means is half a standard deviation. The most common measures of effect size are Cohen's d, Pearson's correlation coefficient r, and the odds ratio, although other measures may also be used.

Cohen's d is a statistic which is independent of the sample size and is defined as d = (m1 - m2) / σpooled, where m1 and m2 represent the two means and σpooled is a combined (pooled) value for the standard deviation (Zaiontz, 2015).

The effect size given by d is normally viewed as small, medium or large as follows:

d = 0.20 – small effect

d = 0.50 – medium effect

d = 0.80 – large effect

The corresponding value of d in single-sample hypothesis testing of the mean is d = (x̄ - μ0) / s, where x̄ is the sample mean, μ0 is the hypothesized population mean and s is the sample standard deviation.
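A small Python sketch of Cohen's d for two independent groups, using the pooled standard deviation; the numbers are placeholders, not data from this assignment.

import numpy as np

def cohens_d(x1, x2):
    # Cohen's d = (m1 - m2) / pooled standard deviation
    n1, n2 = len(x1), len(x2)
    s1, s2 = np.var(x1, ddof=1), np.var(x2, ddof=1)
    pooled_sd = np.sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    return (np.mean(x1) - np.mean(x2)) / pooled_sd

print(cohens_d([5, 6, 7, 8], [3, 4, 5, 6]))   # about 1.55, a large effect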

The main goal is to provide a solid sense of whether a difference between two groups is meaningfully large, independent of whether the difference is statistically significant.

On the other hand, t Test effect size indicates whether or not the difference between two groups’ averages is large enough to have practical meaning, whether or not it is statistically significant.

However, a t test questions whether a difference between two groups’ averages is unlikely to have occurred because of random chance in sample selection. It is expected that the difference is more likely to be meaningful and “real” if:

the difference between the averages is large,
the sample size is large, and,
responses are consistently close to the average values and not widely spread out (the standard deviation is low).

A statistically significant t test result is one in which the difference between two groups is unlikely to have occurred because the sample happened to be atypical. Statistical significance is determined by the size of the difference between the group averages, the sample size, and the standard deviations of the groups. For practical purposes, statistical significance suggests that the two larger populations from which the samples were drawn are “actually” different (Zaiontz, 2015).

The t test’s statistical significance and the t test’s effect size are its two primary outputs. The t statistic is used to test hypotheses about an unknown population mean (µ) when the value of the population variance (σ²) is unknown; it uses the sample variance (s²) as an estimate of the population variance (σ²) (Zaiontz, 2015).

For the t test to be justified, the following assumptions must be met (a quick way to screen the normality and equal-variance assumptions is sketched after this list):

Sample observations must be independent; that is, there is no relationship between or among any of the observations (scores) in the sample.
The population from which the sample has been obtained must be normally distributed.
The dependent variable must be continuous.
The dependent variable has a normal distribution with the same variance, σ², in each group (as though the distribution for group A were merely shifted over to become the distribution for group B, without changing shape).
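A quick way to screen the normality and equal-variance assumptions is sketched below, using two placeholder score arrays group_a and group_b; this is a rough check, not a substitute for inspecting the data.

import numpy as np
from scipy import stats

# Placeholder score arrays; in practice these would be the two groups' scores
group_a = np.array([45, 50, 38, 60, 42, 55, 47, 49, 52, 44])
group_b = np.array([22, 30, 18, 35, 25, 28, 20, 26, 31, 24])

# Shapiro-Wilk: a small p-value suggests a departure from normality
print(stats.shapiro(group_a))
print(stats.shapiro(group_b))

# Levene's test: a small p-value suggests unequal variances
print(stats.levene(group_a, group_b))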


Note: σ (“sigma”), the scale parameter of the normal distribution, also known as the population standard deviation, is easy to see on a picture of a normal curve. Located one σ to the left or right of the mean are the two places where the curve changes from convex to concave (where the second derivative is zero) (Zaiontz, 2015).

The data set selected was from lesson 24.

The independent variable: Talk.

The dependent variable: Stress.

Hypotheses.

Null hypothesis: H0: µ1 - µ2 = 0; there is no difference in talk between the low-stress and high-stress groups.

For Levene’s test for equality of variances, the null hypothesis is H0: σ1² = σ2² (the group variances are equal).

Alternative hypothesis: Ha: µ1 - µ2 ≠ 0; there is a difference in talk between the low-stress and high-stress groups.

For Levene’s test for equality of variances, the alternative hypothesis is Ha: σ1² ≠ σ2² (the group variances are not equal).

Statistical Report

The group statistics for the independent-samples test indicate that the low-stress group (n = 15, M = 45.20, SD = 24.969, SE = 6.447) scored higher on talk than the high-stress group (n = 15, M = 22.07, SD = 27.136, SE = 7.006).

Group Statistics (dependent variable: talk, grouped by stress)

stress         N   Mean   Std. Deviation   Std. Error Mean
Low Stress    15  45.20           24.969             6.447
High Stress   15  22.07           27.136             7.006

The sample size for these results is N = 30. The test statistic is t(28) = 2.430 with a two-tailed p-value of .022, which is less than .05 and therefore provides evidence to reject the null hypothesis: the difference in talk between the stress groups is statistically significant. Levene’s test (p = .881 > .05) indicates that the variances of the two groups can in all probability be treated as equal.

Independent Samples Test (variable: talk)

Levene’s Test for Equality of Variances: F = .023, Sig. = .881

t-test for Equality of Means
                                  t       df   Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI of the Difference (Lower, Upper)
Equal variances assumed       2.430       28              .022            23.133                   9.521   (3.630, 42.637)
Equal variances not assumed   2.430   27.808              .022            23.133                   9.521   (3.624, 42.643)

Levene’s test for equality of variances yields p = .881, which is greater than .05, so the hypothesis of equal variances is not rejected; there is no evidence that the variances of the two groups differ, and equal variances are assumed. For the t test itself, the two-tailed p-value of .022 is less than .05, so the null hypothesis of equal means is rejected: there is a statistically significant difference in talk between the low-stress and high-stress groups.
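As a cross-check, the t statistic in the table above can be recomputed from the reported group summaries with scipy's ttest_ind_from_stats; the sketch below simply plugs in the means, standard deviations and group sizes reported earlier.

from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=45.20, std1=24.969, nobs1=15,
                            mean2=22.07, std2=27.136, nobs2=15,
                            equal_var=True)
print(t, p)   # roughly t = 2.43, p = .022 on 28 degrees of freedom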

Graph.

SPSS syntax and output files.

T-Test

T-TEST GROUPS=stress(1 2)

/MISSING=ANALYSIS

/VARIABLES=talk

/CRITERIA=CI(.95).

Notes

Output Created: 28-JAN-2015 00:27:21
Comments:
Input
  Data: C:\Users\Frances Roul\AppData\Local\Temp\Temp1_new_datasets_7e-10.zip\new_datasets_7e\new_datasets_7e\Lesson 24 Data File 1.sav
  Active Dataset: DataSet3
  Filter:
  Weight:
  Split File:
  N of Rows in Working Data File: 30
Missing Value Handling
  Definition of Missing: User defined missing values are treated as missing.
  Cases Used: Statistics for each analysis are based on the cases with no missing or out-of-range data for any variable in the analysis.
Syntax:
  T-TEST GROUPS=stress(1 2)
    /MISSING=ANALYSIS
    /VARIABLES=talk
    /CRITERIA=CI(.95).
Resources
  Processor Time: 00:00:00.02
  Elapsed Time: 00:00:00.01

[DataSet3] C:\Users\Frances Roul\AppData\Local\Temp\Temp1_new_datasets_7e-10.zip\new_datasets_7e\new_datasets_7e\Lesson 24 Data File 1.sav

GGraph

* Chart Builder.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=stress talk MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: stress=col(source(s), name("stress"), unit.category())
  DATA: talk=col(source(s), name("talk"))
  GUIDE: axis(dim(1), label("stress"))
  GUIDE: axis(dim(2), label("talk"))
  GUIDE: text.title(label("Independent-Sample t-Test Graph"))
  SCALE: cat(dim(1), include("1", "2"))
  SCALE: linear(dim(2), include(0))
  ELEMENT: interval(position(stress*talk), shape.interior(shape.square))
END GPL.

Notes

Output Created: 28-JAN-2015 01:31:16
Comments:
Input
  Data: C:\Users\Frances Roul\AppData\Local\Temp\Temp1_new_datasets_7e-10.zip\new_datasets_7e\new_datasets_7e\Lesson 24 Data File 1.sav
  Active Dataset: DataSet3
  Filter:
  Weight:
  Split File:
  N of Rows in Working Data File: 30
Syntax:
  GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=stress talk MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
  BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: stress=col(source(s), name("stress"), unit.category())
    DATA: talk=col(source(s), name("talk"))
    GUIDE: axis(dim(1), label("stress"))
    GUIDE: axis(dim(2), label("talk"))
    GUIDE: text.title(label("Independent-Sample t-Test Graph"))
    SCALE: cat(dim(1), include("1", "2"))
    SCALE: linear(dim(2), include(0))
    ELEMENT: interval(position(stress*talk), shape.interior(shape.square))
  END GPL.
Resources
  Processor Time: 00:00:00.14
  Elapsed Time: 00:00:00.14

[DataSet3] C:\Users\Frances Roul\AppData\Local\Temp\Temp1_new_datasets_7e-10.zip\new_datasets_7e\new_datasets_7e\Lesson 24 Data File 1.sav

References

Frankfort-Nachmias, C., & Nachmias, D. (2008). Research methods in the social sciences (7th ed.). New York, NY: Worth Publishers.

Zaiontz, C. (2015). Real statistics using Excel. Retrieved from www.real-statistics.com

Laureate Education, Inc. (Executive Producer). (2009n). The t test for related samples. Baltimore: Author.


Parenting Styles in early childhood

Parenting Style as a Mediator between Children’s Negative Emotionality and Problematic Behavior in Early Childhood

Abstract

Parenting style is of particular interest in the development of negative emotionality that leads to difficult behavior in children. This paper evaluates research focused on the impact parenting has on children’s negative behavior. The objective was to determine the effects of authoritative and authoritarian parenting as they relate to negative behavior in children. Comparisons are made to several studies showing similar results. The objectives, procedures and results are evaluated to determine the strength of the research conducted and the validity of the study. Even with limitations, the research does in fact support that authoritative parenting, which is firm but loving, is more effective at helping children not act out than is authoritarian parenting, which emphasizes compliance and conformity.

Introduction

Anyone who has ever spent time with preschool children knows that the lives of such young people are marked both by negative emotions and by acting out (often described as “temper tantrums”). Both are typical and age appropriate. However, also age appropriate for the preschool cohort is the need to begin to learn how to regulate their behavior. While young children have some ability to be self-regulating (as opposed to infants), they lack the cognitive and emotional skills to do so on their own in any consistent manner. Thus one of the tasks of parenting preschool-aged children is to help them learn to separate negative emotions from negative actions.

Key to this process is teaching children that negative emotions are perfectly acceptable. The parenting style that is best geared to teaching both aspects of this – that negative emotions are natural but that negative acting out is not acceptable – is the authoritative parenting style. In contrast, an authoritarian parenting style can be fundamentally harmful to the process of teaching young children to honor but contain their negative emotions such as anger, fear, and dislike.

Authoritarian parenting is marked by the parents’ having very high expectations of compliance with the rules they put in place and a high level of conformity to the parents’ beliefs. Authoritarian parents tend to give commands rather than explanations. Authoritative parents also set standards and hold expectations for their children, but they allow an appropriate amount of independence on the part of the child and allow for questioning and discussion.

Statement of the problem

The problem explored by the research examined here is how parents may help young children learn to separate their negative emotions (especially anger and frustration, both very common and entirely acceptable emotions at this stage of life) from negative actions. Parents may often find themselves both angry and frustrated at the child who turns around and bites a friend on the playground, or who collapses onto the grocery store floor when denied an especially sugary treat, and they may respond in much the same way as their children, yelling back and losing their own tempers. This is hardly an effective response.

The most effective response, according to the research examined here, is for parents to help their children understand their emotions, put words to those emotions, and to find appropriate ways to act out their emotions – perhaps by tearing paper into small pieces, building up towers of blocks and knocking them over, etc. Parents who help their children separate negative emotions from negative actions are authoritative, allowing children to ask questions and receive honest answers. Parents who insist on compliance and conformity tend to exacerbate their children’s negative behavior.

The hypothesis that this paper examines is the following: An authoritative parenting style helps reduce negative behaviors in preschool children that are associated with negative emotions.

Literature Review

The research summarized here fully supports the idea that parents using an authoritative style are more successful at helping their children reduce their negative behaviors than are parents using an authoritarian style. Paulussen-Hoogeboom et al. (2008) found that while young children will act out in negative ways at times regardless of parenting style (this is only to be expected at this developmental stage), authoritative parenting helped reduce this behavior. In other words, they found “that the relations between child negative emotionality and internalizing and externalizing behaviors were partially mediated by mothers’ authoritative parenting style” (p. 209).

Moreover, when the authors used confirmatory factor analysis to decontaminate possible overlap in item content between measures assessing temperament and problematic behavior, the association between negative emotionality and internalizing behavior was fully mediated by authoritative parenting. (p.209)

The researchers used the following definition for authoritative parenting: “Authoritative parenting is characterized by a combination of high warmth, firm but fair control, and the use of explanations and reasoning” (p. 212). They observed 98 male and 98 female children from two and a half to four years in Dutch daycare centers. They assessed the parents’ style of interaction with their children and determined how effective authoritarian and authoritative parents were in terms of helping their children disconnect negative emotions from negative “externalization”. They found that there was a statistically positive correlation between authoritative parenting and children’s ability to disconnect negative feelings from negative actions.

The study attempts to provide insight by measuring mothers’ perceptions of their children as they relate to problematic behaviors, both internalizing and externalizing. In an effort to fill gaps in previous research, the focus was on three-year-old toddlers. In collaboration with child health centers in Holland, 196 preschool children and their mothers were randomly selected via a letter distributed to 750 families from the health centers. The researchers set out to test for direct associations between parenting style and negative emotionality, expecting higher levels of negative emotionality with authoritarian parenting than with authoritative parenting. The study also intended to relate problematic behavior indirectly to the type of parenting style. Lastly, the researchers wanted to show the association between lower levels of SES, the level of authoritative parenting, and internalizing and externalizing behaviors. (Figure 1, 2008)

Findings

Paulussen-Hoogeboom et al. (2008) present a number of key findings with pervasive implications for parenting. All toddlers engage in behaviors such as biting, hitting, screaming, or otherwise acting out, and such behaviors arise as a result of negative emotions. Parents, along with other children and other caregivers, often find these behaviors hard to deal with. The response by others in the children’s world may itself be highly negative and may thus provoke additional negative feelings, which in turn provoke additional negative behaviors. This is a cycle that is bad for all concerned.

Paulussen-Hoogeboom et al. (2008) further validated the finding of others that an authoritarian parenting style is aimed at getting children to stop these negative behaviors by commanding them to follow parental orders. However, they also found that such a parenting style ignores the underlying emotions and so is ineffective in preventing the negative behaviors involved. Authoritative parents talk with their children about these emotions, help them understand that such emotions are natural and appropriate, and show them that there are better ways to express these feelings that will not be seen as negative by others. It is this key part, acknowledging emotions while helping children disconnect emotions from actions, that makes authoritative parenting effective in reducing negative actions.

In other words, parents and young children can work together (with the far greater amount of work being done by the parents, of course) to create a positive feedback system in which children learn to value their emotions while moderating their behavior.

The next important finding by Paulussen-Hoogeboom et al. (2008) was that whatever elements of “personality” or “temperament” are innate, any inborn tendency to act out negatively is far less important than parenting style in terms of the behavior of children. In other words, Paulussen-Hoogeboom et al. (2008) found that authoritative parenting can overcome innate tendencies in children to act out. This is a very important finding for parents and other caregivers.

In this longitudinal study, the research showed that while young children will act out in negative ways at times regardless of parenting style, authoritative parenting helped reduce this behavior (Paulussen-Hoogeboom et al., 2008). Preliminary analysis using correlations and covariances showed no significant differences in mean scores based on gender or birth-order variables. Using a variety of statistical tools, including chi-square and the AGFI to measure the amount of variance and covariance explained, the results indicated a good model fit. The adjusted model, which omitted certain paths, removed authoritarian parenting from the model and revealed a negative association between negative emotionality and maternal authoritative parenting. (Figure 2, 2008)

Discussion

The study set out to determine possible causes of, and links between, children’s negative emotionality and problematic behavior using a sample drawn from the general population. There was evidence that a child’s negative emotionality and problematic behavior are related to parenting and are mediated by the mother’s authoritative parenting.

This research is echoed by others and in fact substantiates the body of research in this area. Similar findings were reported by Kochanska, Murray, and Coy (1997), who found that mothers who scored high on sensitivity measures and responded quickly to requests made by their toddlers (that is, mothers who used an authoritative parenting style) were effective in limiting negative behavior on the part of their children. Both sensitivity and speed of response were directed at children’s expressing negative emotions in words: the maternal response emphasized and supported the children’s use of verbal expression rather than physical acting out when the child felt negative emotions.

In this longitudinal study, one year after the researchers initially observed the toddlers, they found that the children of responsive parents rated higher on cooperativeness and prosocial behavior than did children whose parents had a less responsive style.

Kochanska, Murray, & Coy (1997) found that both outgoing and shy toddlers benefited from a responsive but firm parenting style. This finding is important because it suggests that parenting style can at least in some measure trump temperament or personality, or “Different socialization experiences can predict the same developmental outcomes for children with different predispositions, and a given socialization experience can predict divergent developmental for different children.”

Another study that laid the groundwork for the work by Paulussen-Hoogeboom et al. was Clark and Ladd (2000). Observing kindergarten-aged children and their mothers, they assessed the level of mutual warmth, happiness, reciprocity, and engagement (the terms they used to operationalize the concept of authoritative parenting). They found that children and mothers who scored high on all of these measures (and who thus met the requirements for an authoritative family) scored much higher on positive behavior regardless of internal emotional state. Both teachers and peers described these children as more empathetic, more socially accepting and acceptable, as having more friends, and as having more harmonious relationships with both other children and adults.

The body of research in this area was confirmed and consolidated by Paulussen-Hoogeboom et al. (2008). All three of these studies find clear, statistically significant associations between an authoritative parenting style and the ability of young children to contain negative emotions in an appropriate way. Paulussen-Hoogeboom et al. (2008) summarized their findings:

The finding that an authoritative parenting style mediates the relations between negative emotionality and problematic behaviors underscores the importance of providing effective parenting support to parents who have difficulties in dealing with their young child’s negative emotionality on a daily basis.

When parents can be trained and encouraged to react to their children’s negative emotionality in an adaptive way, parent-child interactions may become more enjoyable, thereby reducing the occurrence of problematic behaviors and preventing more serious behavioral problems later in life (Campbell, 1995; Patterson, 1982). We note that even in general population samples, a substantial percentage of children (up to 10%) may develop internalizing- and externalizing-behavior problems in the clinical range. (p. 226)

In any research, one must consider limitations that may affect the results of the study, and several should be noted here. The correlational design limits causal interpretation, some findings may be accounted for by genetics, there was little diversity in socioeconomic backgrounds, and the study focused on only one parent. The findings also revealed a significant association between increased negative emotionality and less supportive parenting, which was more prevalent in lower socioeconomic backgrounds (Paulussen-Hoogeboom, Stams, Hermanns, & Peetsma, 2007).

Conclusion

The findings of Paulussen-Hoogeboom et al. (2008) reveal that authoritative parenting can help young children disengage negative emotions from negative behavior, a lesson that has immense value across the entire lifespan. Through authoritative parenting, mothers were able to help their children understand that such emotions are natural and appropriate, and that there are better ways to express these feelings that will not be seen as negative by others. These findings are consistent with other studies in the area. The study is not without limitations but still successfully supports the hypothesis presented.

References

Kochanska, G., Murray, K., & Coy, K. C. (1997). Inhibitory control as a contributor to conscience in childhood: From toddler to early school age. Child Development, 68(2), 263-277. Retrieved February 23, 2010, from Career and Technical Education. (Document ID: 12543990).

Clark, K. E., & Ladd, G. W. (2000). Connectedness and autonomy support in parent-child relationships: Links to children’s socioemotional orientation and peer relationships. Developmental Psychology, 36(4), 485-498. Retrieved February 23, 2010, from Research Library. (Document ID: 56531644).

Paulussen-Hoogeboom, M. C., Stams, G. J. J. M., Hermanns, J. M. A., & Peetsma, T. T. D. (2007). Child negative emotionality and parenting from infancy to preschool: A meta-analytic review. Developmental Psychology, 43(2), 438. Retrieved February 23, 2010, from Research Library. (Document ID: 1249797641).

Paulussen-Hoogeboom, M., Stams, G., Hermanns, J., Peetsma, T., & van den Wittenboer, G. (2008). Parenting style as a mediator between children’s negative emotionality and problematic behavior in early childhood. The Journal of Genetic Psychology, 169(3), 209-226. Retrieved February 23, 2010, from Research Library. (Document ID: 1548809441).

Analysis of Obesity in the UK

Obesity in England: Reason & Consequences

Generally, the objective of this statistical report is to evaluate obesity in England.

1.0 Abstract

The main purpose of this report is to present a statistical analysis of obesity in England, focusing on physical activity and the lifestyles of people in England. A further objective is to highlight the fact that people’s physical activity and lifestyles are changing year by year. The report analyses obesity statistics for the population of England and then discusses physical activity in relation to obesity. To aid the reader’s understanding, historical tables, graphs and pie charts are included, which also help readers compare obesity rates with physical activity and lifestyle statistics.

2.0 Introduction

Figure 1 shows the formula for calculating BMI with different units of measurement. Mass in the BMI formula can be expressed in kilograms (kg), pounds (lbs) or stones (st); however, the SI unit for mass in BMI remains the kilogram.

(Figure 1)

Obesity can be defined as the condition of an individual who is overweight with a significant degree of body fat (NHS, 2012). Over the past twenty-five years the occurrence of obesity in England has been measured, and studies found that the figures have doubled compared with earlier years (Public Health England, 2014). There are several possible causes of obesity; the two main factors are lack of physical activity and lifestyle. Obesity is undoubtedly harmful to an individual’s health: an individual facing obesity may encounter severe health issues such as diabetes, strokes, heart disease and even common cancers such as breast cancer or colon cancer (NHS, 2012). The question is, how can one determine whether an individual is considered obese?

An individual’s weight can be measured in various ways to determine the severity of overweight. According to the United Kingdom’s National Health Service (NHS), the method most widely used is the body mass index (BMI). Using the calculation in Figure 1, an individual can determine whether he or she is overweight or obese. BMI severity is separated into a few categories: individuals with a BMI of 25-29 are considered overweight, individuals with a BMI between 30 and 40 are considered obese, and individuals with a BMI over 40 are considered severely obese (NHS, 2012).
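A minimal Python sketch of the BMI calculation and the NHS categories described above; the thresholds follow the text, and the function name is our own illustration.

def bmi_category(weight_kg, height_m):
    # Return BMI (kg/m^2) and the weight category used in this report
    bmi = weight_kg / height_m ** 2
    if bmi >= 40:
        label = "severely obese"
    elif bmi >= 30:
        label = "obese"
    elif bmi >= 25:
        label = "overweight"
    else:
        label = "not overweight"
    return bmi, label

print(bmi_category(95, 1.75))   # BMI of about 31.0 -> obese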

This report provides essential statistical data to give readers a bigger picture of obesity in England. The statistics are supported with graphs, tables and pie charts to better illustrate the comparisons between the variables. By the end of the report, readers will understand the potential reasons for obesity in terms of physical activity, as well as its consequences.

3.0 Methodology

The information used in this report was collected from various types of secondary sources such as online journals, articles, websites and books. Several reliable websites and annual reports of official institutions were used to interpret and analyse the data, which were then converted into the information discussed in this report. The websites used include the Guardian, the Telegraph, and the National Health Service (NHS). The obesity data and information were mainly obtained from reports published by the NHS in order to improve the credibility and reliability of this report. In short, the information, data and materials in this report are genuine, trustworthy and reliable.

4.0 Findings

4.1 Statistics of obesity in England by age group (2002 to 2012)

(Graph 1)

Source: Hospital Episode Statistics (HES), Health & Social Care Information Centre (2014).

Graph 1 above shows the statistics of obesity in England from 2002 to 2013 by age group, from the age of 16. The statistics show that the obese population in England trended upward from 2002 to 2013 across all age groups (16 to 74 and over). In 2002 there was a record of 29,237 people facing obesity, while in 2003 the figure had increased significantly to 33,546 people, a 14.74% change. The population with obesity in England then rose rapidly from 2004 to 2009, with increases of 21.45%, 27.68%, 29.20%, 20.39%, 27.28% and 38.90% respectively; in absolute terms, the number of people suffering from obesity went up dramatically from 40,741 to 142,219.

Compared with 2009, the percentage change in the obese population reached its peak of 48.91% in 2010, with a record of 211,783 obese individuals aged 16 to 74 and over. The number of people facing obesity in England then climbed to 266,666 in 2011, a 25.91% change compared with 2010. Finally, the total obese population in England in 2012 reached 292,404 people, although this increase accounted for only a 9.65% change. In the bigger picture, the obese population in England escalated from 2002 to 2013 by a massive increase of roughly 900%.
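The year-on-year percentage changes quoted above can be reproduced with a one-line pandas calculation; the sketch below uses only the adjacent-year totals quoted in this section.

import pandas as pd

# Totals quoted in this section (intermediate years omitted)
obesity = pd.Series({2009: 142219, 2010: 211783, 2011: 266666, 2012: 292404})
print((obesity.pct_change() * 100).round(2))   # 2010: 48.91, 2011: 25.91, 2012: 9.65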

4.2 Obesity between men and women in England (Year 2002-2012)

(Graph 2)

Source: Hospital Episode Statistics (HES), Health & Social Care Information Centre (2014).

Graph 2 represents the obese population of men and women in England. The graph shows a significant upward trend in the recorded statistics. It also shows the difference between obese men and women: the numbers for both genders increased year by year. In 2002, the number of women suffering from obesity (17,169) was about 5,100 higher than the number of obese men (12,068).

Furthermore, in 2007 the number of obese women (48,829) exceeded the number of obese men by 16,749, roughly three times the difference recorded in 2002. The most striking data, however, were recorded in 2012, when the population of obese women (192,795) was approximately twice that of obese men (99,579).

As a result, we can conclude that, within England’s obese population, the number of women suffering from obesity is higher than the number of men. According to the research, lack of physical activity is a major cause of obesity.

5.0 Physical activity

Physical activity is known to bring health benefits to individuals, and it has been shown to reduce the incidence of many chronic conditions, including obesity (HSCIC, 2012). Individuals who lack physical activity, however, may suffer from obesity.

5.1 Physical activity guidelines

Category                     MPA (minutes/week)   VPA (minutes/week)
Active                       150 or more          75 or more
Some activity                60 – 149             30 – 74
Low activity (Overweight)    30 – 59              15 – 29
Inactive (Obese)             < 30                 < 15

MPA: Moderate intensity Physical Activity
VPA: Vigorous intensity Physical Activity

(Figure 2)

Source: Hospital Episode Statistics (HES), Health & Social Care Information Centre (2014).

HSCIC (2012) set up a standard for physical activity guidelines as shown in Figure 2. Activity is divided into four categories to determine whether an individual is active or inactive; an individual must meet the MPA threshold, the VPA threshold, or both in order to fall into a given category.
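Under the thresholds in Figure 2, assigning an individual to a category can be sketched as below; this is a simplified reading in which meeting either the MPA or the VPA threshold is sufficient.

def activity_category(mpa_minutes, vpa_minutes):
    # Classify weekly activity using the guideline thresholds in Figure 2
    if mpa_minutes >= 150 or vpa_minutes >= 75:
        return "Active"
    if mpa_minutes >= 60 or vpa_minutes >= 30:
        return "Some activity"
    if mpa_minutes >= 30 or vpa_minutes >= 15:
        return "Low activity (overweight)"
    return "Inactive (obese)"

print(activity_category(mpa_minutes=45, vpa_minutes=0))   # Low activity (overweight)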

5.2 Self-reported physical activity of men and women

(Chart 1) (Chart 2)

Source: Hospital Episode Statistics (HES), Health & Social Care Information Centre (2014).

HSCIC (2012) stated that individuals must have at least 30 minutes of MPA per week in order to avoid obesity; individuals in the low activity and inactive categories are considered overweight and obese. Chart 1 and Chart 2 are pie charts representing the self-reported physical activity data collected by HSCIC (2012). According to the two charts, the percentage of men who are active (67%) is clearly higher than that of women (55%), a difference of 12 percentage points. In 2012, 26% of women were inactive, and the percentage of women with low activity was slightly (2 percentage points) higher than that of men. In contrast, only 19% of men were inactive, 7 percentage points lower than women.

In comparison, the percentage of inactive women is higher than that of inactive men, whereas the percentage of active men is higher than that of women. Since individuals who fall into the ‘low activity’ and ‘inactive’ categories are considered overweight and obese, we can conclude, referring to the figures above, that lack of physical activity is one of the main causes of obesity, and this also helps explain why the population of obese women exceeded that of men from 2002 to 2012.

6.0 Comparative rates of adults’ obesity in 2010

(Graph 3)

Source: National Obesity Observatory, International Comparisons of Obesity Prevalence, available at: www.noo.org.uk/NOO_about_obesity/international/

Graph 3 shows the latest data (2010) on comparative rates of adult obesity. The country with the highest obesity prevalence is the United States (35.70%), followed by Mexico, Scotland and New Zealand in second, third and fourth place with prevalences of 30%, 28.20% and 26.50% respectively. England’s obesity prevalence is 26.10%, which is high compared with countries such as Australia (24.60%), Northern Ireland (23%), Luxembourg (22.50%) and the Slovak Republic (16.90%). Japan and Korea have the lowest obesity prevalence of the countries shown, at 3.90% and 3.80% respectively. Ultimately, this graph shows that the level of obesity in England must be considered severe.

6.1 Map of excess weight of England

Map 1 shows the percentage of adults who are overweight or obese in the different regions of England.

The Guardian (2014) reported that, taking all regions into account, an average of 64% of adults in England are classed as overweight or obese.

(Map 1)

7.0 Cost of Obesity

The cost of obesity consists of the human cost and the National Health Service (NHS) cost. This section discusses both.

Figure 2 below shows the relative risk for women and men of diseases caused by obesity. The table includes diseases such as Type 2 diabetes, hypertension, stroke and cancer. It can be seen that the relative risks for women are generally higher than for men, especially for Type 2 diabetes, where the risk is more than twice as high. Type 2 diabetes can cause serious life-shortening complications that affect mortality (NAO, 2011).

7.1 Human Cost of obesity

Disease                  Relative risk – Women   Relative risk – Men
Type 2 Diabetes          12.7                    5.2
Hypertension             4.2                     2.6
Myocardial Infarction    3.2                     1.5
Cancer of the Colon      2.7                     3
Angina                   1.8                     1.8
Gall Bladder Diseases    1.8                     1.8
Ovarian Cancer           1.7                     –
Osteoarthritis           1.4                     1.9
Stroke                   1.3                     1.3

(Figure 2)

Source: National Audit Office estimates based on literature review

7.2 NHS Cost of Obesity

(Graph 4)

Source: National Audit Office estimates (2012)

Graph 4 shows the estimated cost of obesity in 2012. The estimated spending of £457m on obesity is a burden on England's economy. NAO (2012) estimated that the cost of obesity will increase dramatically to £6.3 billion by 2015 and up to £9.7 billion by 2050. The reason the cost of obesity will be so high is the indirect cost of lost output in the economy: NAO (2001) stated that the economy suffers when members of England's workforce fall sick or die because of obesity. Therefore, the consequences of obesity must not be ignored and must be taken seriously.

8.0 Conclusion

In short, the statistics in this report identify some important details regarding obesity in England. It is important to understand how obesity and population growth have caused the number of people with obesity to double over the past 25 years. Furthermore, the trend for obesity across all age groups in England increased from 2002 to 2013. The differences between the genders help explain the rise in obesity in relation to physical activity, given that men are more active and women less so. Importantly, this report sets out the consequences of obesity, namely severe illnesses that cause death, with the related risk statistics for men and women. Lastly, the report compared obesity across countries, showed the percentage of obesity in the regions of England, and presented the human and NHS costs of obesity.

9.0 Recommendations

As mentioned above, the level of obesity in England is becoming more significant year by year. The Government should run more campaigns to fight obesity, providing individuals and families with information about the importance of physical activity. In addition, the Government should continue to subsidise the NHS 'Health Check programme' in order to prevent severe diseases such as heart disease, stroke and cancer.

Besides this, the Government should not focus only on physical activity; it must also address other causes of obesity, such as diet and lifestyle. It could implement policy measures to fight obesity, such as increasing taxation on unhealthy, high-fat foods to discourage people from buying them. Last but not least, the Government could also increase advertising of healthy-living campaigns and of the disadvantages of obesity to encourage people to tackle it.

10.0 References:

Boseley, S. (2014). The Guardian: Almost two-thirds of adults in England classed as overweight by health body. [Online] Available at: http://www.theguardian.com/society/2014/feb/04/two-thirds-adults-overweight-england-public-health [Accessed 28th March 2014].

National Health Service. (2014). Obesity: Introduction. [Online] Available at: http://www.nhs.uk/conditions/Obesity/Pages/Introduction.aspx [Accessed 27th March 2014].

Public Health England. (2014). Trends in Obesity Prevalence. [Online] Available at: http://www.noo.org.uk/NOO_about_obesity/trends [Accessed 20th March 2014].

Figure 1 source: http://healthy-living.knoji.com/does-your-bmi-really-matter/

HSCIC. (2014). Statistics on Obesity, Physical Activity and Diet: England 2014. [Online] Available at: http://www.hscic.gov.uk/catalogue/PUB13648/Obes-phys-acti-diet-eng-2014-rep.pdf [Accessed 20th March 2014].

HSCIC. (2012). Physical activity in Adults. [Online] Available at: http://www.hscic.gov.uk/catalogue/PUB13218/HSE2012-Ch2-Phys-act-adults.pdf [Accessed 24th March 2014].

NAO. (2012). An Update on the Government’s Approach to Tackling Obesity. [Online] Available at: http://www.nao.org.uk/wp-content/uploads/2012/07/tackling_obesity_update.pdf [Accessed 24th March 2014].

HSCIC. (2012). Chapter 7: Health Outcomes. [Online] Available at: http://www.hscic.gov.uk/searchcatalogue?productid=13887&returnid=3945 [Accessed 24th March 2014].

NAO. (2001). Tackling Obesity in England. [Online] Available at: http://www.nao.org.uk/wp-content/uploads/2001/02/0001220.pdf [Accessed 28th March 2014].

Public Health England. (2013). Social Care and Obesity: A Discussion Paper. [Online] Available at: http://www.local.gov.uk/documents/10180/11463/Social+care+and+obesity+-+a+discussion+paper+-+file+1/3fc07c39-27b4-4534-a81b-93aa6b8426af [Accessed 29th March 2014].

HSCIC. (2012). Statistics on Obesity, Physical Activity and Diet: England, 2012. [Online] Available at: http://www.hscic.gov.uk/catalogue/PUB05131/obes-phys-acti-diet-eng-2012-rep.pdf [Accessed 20th March 2014].

HSCIC. (2013). Statistics on Obesity, Physical Activity and Diet: England, 2013. [Online] Available at: http://www.bhfactive.org.uk/userfiles/Documents/obes-phys-acti-diet-eng-2013-rep.pdf [Accessed 20th March 2014].

Normal Approximation in R-code


Abstract

The purpose of this research is to determine when it is desirable to approximate a discrete distribution with a normal distribution. In particular, it is convenient to replace the binomial distribution with the normal when certain conditions are met; remember, though, that the binomial distribution is discrete while the normal distribution is continuous. The aim of this study is also to give an overview of how the normal distribution can be applied to approximate the Poisson distribution. The common reason these approximations work rests on the notion of a sampling distribution. I also provide an overview of how binomial probabilities can be calculated using a very straightforward formula for the binomial coefficient; unfortunately, the factorials in that formula can easily lead to computational difficulties. The normal approximation allows us to bypass these problems.

Introduction

The shape of the binomial distribution changes considerably according to its parameters, n and p. If the parameter p, the probability of 'success' (or of a defective item, or of a failure) in a single trial, is sufficiently small (or if q = 1 - p is sufficiently small), the distribution is usually asymmetrical. Alternatively, if p is sufficiently close to 0.5 and n is sufficiently large, the binomial distribution can be approximated by the normal distribution; under these conditions it is approximately symmetrical and inclines towards a bell shape. A binomial distribution with very small p (or p very close to 1) can also be approximated by a normal distribution if n is very large. If n is large enough, sometimes both the normal approximation and the Poisson approximation are applicable; in that case the normal approximation is generally preferable since it allows easy calculation of cumulative probabilities using tables or other technology. When dealing with extremely large samples it becomes very tedious to calculate certain probabilities, and in such circumstances using the normal distribution to approximate the exact probabilities of success is more practical than laborious computation. For n sufficiently large (say n > 20) and p not too close to zero or 1 (say 0.05 < p < 0.95), the distribution approximately follows the normal distribution.

To find the binomial probabilities, this can be used as follows:

If X ~ Binomial(n, p) where n > 20 and 0.05 < p < 0.95, then X approximately follows the Normal distribution with mean E(X) = np and variance Var(X) = np(1 - p).

So Z = (X - np)/√(np(1 - p)) is approximately N(0, 1).

R will be used for calculating probabilities associated with the binomial, Poisson and normal distributions. R code makes it possible to test the input and present the output graphically; the only system requirement for R is an operating system on which it can run.

The approach is as follows (a short sketch of the procedure is given below):
Firstly, consider the conditions under which the discrete distribution inclines towards a normal distribution.
Secondly, generate a sample from the discrete distribution so that it inclines towards a bell shape; in R this only requires specifying the sample size needed.
Lastly, compare the generated distribution with the target normal distribution.
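
As a brief illustration of this procedure, the following R sketch (with n = 100 and p = 0.4 chosen purely as illustrative assumptions) draws a large binomial sample and overlays the corresponding normal density:

set.seed(1)                                   # for reproducibility
n <- 100; p <- 0.4                            # illustrative binomial parameters
x <- rbinom(10000, size = n, prob = p)        # generate a binomial sample
hist(x, breaks = 30, freq = FALSE,
     main = "Binomial sample vs normal density")
curve(dnorm(x, mean = n * p, sd = sqrt(n * p * (1 - p))),
      add = TRUE, lwd = 2)                    # overlay the target normal density

The closer the histogram tracks the curve, the better the normal approximation for the chosen n and p.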

Normal approximation of binomial probabilities

Let X ~ BINOM(100, 0.4).

Using R to compute Q = P(35 < X ≤ 45) = P(35.5 < X ≤ 45.5):

> diff(pbinom(c(35, 45), 100, .4))

[1] 0.6894402

Whether for theoretical or practical purposes, using the Central Limit Theorem is a more convenient way to approximate these binomial probabilities.

This applies when n is large and both np/q and nq/p are greater than 3, where q = 1 - p.

The CLT states that, for situations where n is large,

Y ~ BINOM(n, p) is approximately NORM(μ = np, σ = [np(1 - p)]^(1/2)).

Hence, using the first expression Q = P(35 < X ≤ 45),

the approximation results in:

Φ(1.0206) - Φ(-1.0206) = 0.6926

A correction for continuity adjustment is used when a continuous distribution approximates a discrete one. Recall that a continuous random variable can take any real value within a range or interval, while a discrete random variable can take only specified values. Thus, using the normal distribution with this correction to approximate the binomial gives more precise approximations of the probabilities.

After applying the continuity correction to Q = P(35.5 < X ≤ 45.5), we obtain:

Φ(1.1227) - Φ(-0.91856) = 0.6900

We can verify the calculation using R,

> pnorm(c(1.1227))-pnorm(c(-0.91856))

[1] 0.6900547

Below, alternative R code is used to compute and plot the normal approximation to the binomial.

Let X ~ BINOM(100, 0.4) and Q = P(35 < X ≤ 45).

> pbinom(45, 100, .4) - pbinom(35, 100, .4)

[1] 0.6894402

# Normal approximation
> pnorm(5/sqrt(24)) - pnorm(-5/sqrt(24))

[1] 0.6925658

# Applying continuity correction
> pnorm(5.5/sqrt(24)) - pnorm(-4.5/sqrt(24))

[1] 0.6900506

x1 = 36:45                      # integer values inside the interval (35, 45]

x2 = c(25:35, 46:55)            # integer values outside the interval

x1x2 = seq(25, 55, by = .01)    # fine grid for the normal density curve

plot(x1x2, dnorm(x1x2, 40, sqrt(24)), type = "l",
     xlab = "x", ylab = "Binomial Probability")

lines(x2, dbinom(x2, 100, .4), type = "h", col = 2)

lines(x1, dbinom(x1, 100, .4), type = "h", lwd = 2)

Poisson approximation of binomial probabilities

For situations in which p is very small and n is large, the Poisson distribution can be used as an approximation to the binomial distribution; the larger the n and the smaller the p, the better the approximation. The Poisson probability mass function, P(X = x) = e^(-λ) λ^x / x! with λ = np, is used to approximate the binomial probabilities.

A Poisson approximation can be used when n is large (n > 50) and p is small (p < 0.1).

Then X ~ Po(np), approximately.

AN EXAMPLE

The probability that a person will develop an infection even after taking a vaccine that was supposed to prevent it is 0.03. In a simple random sample of 200 people in a community who get vaccinated, what is the probability that six or fewer people will be infected?

Solution:

Let X be the random variable for the number of people infected. X follows a binomial distribution with n = 200 and p = 0.03. The probability of having six or fewer people infected is

P(X ≤ 6) = Σ_{x=0}^{6} C(200, x) (0.03)^x (0.97)^(200-x)

The probability is 0.6063. Calculation can be verified using R as

> sum(dbinom(0:6, 200, 0.03))

[1] 0.6063152

Or otherwise,

> pbinom(6, 200, .03)

[1] 0.6063152

To avoid such tedious calculation by hand, either a Poisson distribution or a normal distribution can be used to approximate the binomial probability.

Poisson approximation to the binomial distribution

To use the Poisson distribution as an approximation to the binomial probabilities, we consider that the random variable X follows a Poisson distribution with rate λ = np = (200)(0.03) = 6. We can then calculate the probability of having six or fewer infections as

P(X ≤ 6) = Σ_{x=0}^{6} e^(-6) 6^x / x!

The result turns out to be similar to the one obtained using the binomial distribution.

Calculation can be verified using R,

> ppois(6, lambda = 6)

[1] 0.6063028

It can be clearly seen that the Poisson approximation is very close to the exact probability.

The same probability can be calculated using the normal approximation. Since binomial distribution is for a discrete random variable and normal distribution for continuous, continuity correction is needed when using a normal distribution as an approximation to a discrete distribution.

For large n with np > 5 and nq > 5, a binomial random variable X with X ~ Bin(n, p) can be approximated by a normal distribution with mean = np and variance = npq, i.e. X ~ N(6, 5.82).

The probability that there will be six or fewer cases is:

P(X ≤ 6) = P(Z ≤ (6 - 6)/√5.82)

As mentioned earlier, a correction for continuity is needed, so the expression becomes

P(X ≤ 6.5) = P(Z ≤ (6.5 - 6)/√5.82)

= P(Z ≤ 0.5/2.4125)

= P(Z ≤ 0.2072)

Using R, the probability, 0.5821, can be obtained:

> pnorm(0.2072)

[1] 0.5820732

It can be noted that this approximation is reasonably close to the exact probability of 0.6063, although the Poisson distribution gives a better approximation here. For larger sample sizes, with n closer to 300, the normal approximation becomes about as good as the Poisson approximation.
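
To see how the gap behaves as n grows, the short sketch below compares the exact binomial probability with its Poisson and (continuity-corrected) normal approximations; the parameters and cut-offs are illustrative assumptions only, scaling the earlier example from n = 200 to n = 300.

p <- 0.03
for (n in c(200, 300)) {
  k <- 6 * n / 200                            # cut-off scaled with n (6 for n = 200, 9 for n = 300)
  exact   <- pbinom(k, n, p)                  # exact binomial P(X <= k)
  poisson <- ppois(k, lambda = n * p)         # Poisson approximation
  normal  <- pnorm(k + 0.5, mean = n * p,     # normal approximation with continuity correction
                   sd = sqrt(n * p * (1 - p)))
  print(round(c(n = n, exact = exact, poisson = poisson, normal = normal), 4))
}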

The normal approximation to the Poisson distribution

The normal distribution can also be used as an approximation to the Poisson distribution whenever the parameter λ is large.

When λ is large (say λ > 15), the normal distribution can be used as an approximation, where

X ~ N(λ, λ)

Here also a continuity correction is needed, since a continuous distribution is used to approximate a discrete one.

Example

A radioactive disintegration gives counts that follow a Poisson distribution with a mean count of 25 per second. Find the probability that in a one-second interval the count is between 23 and 27 inclusive.

Solution:

Let X be the radioactive count in one-second interval, X~Po(25)

Using normal approximation, X~N(25,25)

P(23 ≤ X ≤ 27) = P(22.5 < X < 27.5)   (applying the continuity correction)

= P((22.5 - 25)/5 < Z < (27.5 - 25)/5)

= P(-0.5 < Z < 0.5)

= 0.383 (3 d.p.)

Using R:

> pnorm(c(0.5))-pnorm(c(-0.5))

[1] 0.3829249

In this study it has been concluded that using the normal distribution to approximate the binomial distribution yields accurate approximations when the relevant conditions are met. Moreover, as n gets larger, the binomial distribution looks increasingly like the normal distribution; the normal approximation to the binomial is, in fact, a special case of a more general phenomenon. The importance of employing a correction for continuity has also been investigated, and using R makes the calculations straightforward to verify. Furthermore, a number of examples have been analysed in order to give a better perspective on the normal approximation.

Using the normal distribution as an approximation can be useful; however, if the conditions above are not met, the approximation may not be good at estimating the probabilities.

Models of Accounting Analysis

Historic Cost

In accounting, historic cost is the original monetary value of an economic item. Historic cost is based on the stable measuring unit assumption. In some circumstances, assets and liabilities may be shown at their historic cost, as though there had been no change in value since the date of acquisition. The balance sheet values of such items may therefore differ from their 'true' value (WIKIPEDIA).

Principle

Historic cost is an accounting system in which assets are recorded on the balance sheet at the value at which they were obtained, rather than at current market value. The historic cost standard is used to measure the capital expended to acquire an asset; it is helpful for matching against changes in profits or expenses relating to the asset purchased, and for determining past opportunity costs (Business Dictionary).

Impacts

Under the historical cost basis of accounting, assets and liabilities are recorded at their values when first acquired. They are not then generally restated for changes in values. Costs recorded in the Income Statement are based on the historical cost of items sold or used, rather than their replacement costs (WIKIPEDIA).

Example

The main headquarters of a company, which includes the land and building, was bought for $100,000 in 1945, and its expected market value today is $30 million. The asset is still recorded on the balance sheet at $100,000 (INVESTOPEDIA).

Current Purchasing Power Accounting

Capital maintenance in units of constant purchasing power (CMUCPP) is the International Accounting Standards Board (IASB) basic accounting model originally authorized in IFRS in 1989 as an alternative to traditional historical cost accounting (WIKIPEDIA).

Principle

Current Purchasing Power Accounting (CPPA) involves the restatement of historical figures at current purchasing power. For this purpose, historic figures must be multiplied by conversion factors. The formulas for calculating the conversion factors are:

Conversion factor = Price Index at the date of Conversion/Price Index at the date of item arose

Conversion factor at the beginning = Price Index at the end/Price Index at the beginning

Conversion factor at an average = Price Index at the end/Average Price Index

Conversion factor at the end = Price Index at the end/Price Index at the end

Average Price Index = (Price Index at the beginning + Price Index at the end)/2

CPP Value = Historical value × Conversion factor (Account Management Economics).

Impacts on Financial Statements

Financial statements are prepared on the basis of historical cost, and a supplementary statement is prepared showing historical items in terms of present value based on a general price index. A retail price index or wholesale price index is taken as an appropriate index for converting historical cost items to show changes in the value of money. This method takes into consideration changes in the value of items as a result of the general price level, but it does not account for changes in the value of individual items (Accounting Management).

Example

XYZ Company had a closing inventory balance at 30 June 2012 of $10,000. This inventory had been purchased in the last three months of the financial year. Assume the general price level index was 140 on 1 July 2011, 144 on 31 December 2011 and 150 on 30 June 2012; the average for the year (July 2011 to June 2012) was 145, and the average for April 2012 to June 2012 was 147. To restate the inventory under CPPA we use the formula: book value of inventory × current general price index / average index of the three months = 10,000 × 150/147 = $10,204 (Accounting Education).
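
A minimal R sketch of this restatement, using only the figures quoted in the example above:

cpp_value <- function(historical_value, index_at_conversion, index_when_item_arose) {
  conversion_factor <- index_at_conversion / index_when_item_arose
  historical_value * conversion_factor        # CPP value = historical value x conversion factor
}
cpp_value(10000, 150, 147)                    # approximately 10204, as in the example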

Current Cost Accounting

Current cost accounting is an accounting method that attempts to value assets at their current replacement cost rather than the amount for which they were originally purchased (Ask).

Principles

It affects all the accounts and financial reports, as well as their balancing items. A fundamental principle underlying the estimation of gross value added, and hence GDP, is that output and intermediate consumption must be valued at the prices current at the time production takes place. This implies that goods withdrawn from inventories must be valued at the prices prevailing when they are withdrawn, not at the prices at which they entered inventories (Glossary Of Statistical Terms).

Impacts

Unlike the accounting systems that support the preparation of financial reports, cost accounting systems and reports are not subject to rules and standards such as the Generally Accepted Accounting Principles. As a result, there is huge variety in the cost accounting systems of different organizations, and sometimes even in different parts of the same organization (WIKIPEDIA).

Advantages Of Historic Cost

Historic cost provides a straightforward procedure. It does not record gains until they are realised. The historical costing method is still widely used in accounting systems.

Dis-Advantages Of Historic Cost

Historic cost is treated as the acquisition cost of an asset and does not recognise current market value. Historic cost is concerned only with the allocation of cost, not with the value of an asset; it neglects the current market value of the asset, which may be higher or lower than suggested. It also shows its flaws in times of inflation (Study Mode).

Comments

The historical costing method is still used in accounting systems, but it is a traditional method that does not represent the market value of items, which makes it a less appropriate method to adopt.

Advantages Of Current Purchasing Power Accounting

The current purchasing power method uses the general purchasing power of money as the measuring unit.
It provides a way to calculate the gain or loss in purchasing power from holding monetary items.
In this method, historical accounts continue to be maintained because the restated accounts are prepared on a supplementary basis.
This method preserves the purchasing power of the capital contributed by shareholders, so it is important from the shareholders' point of view.
This method provides reliable financial information for management to formulate policies and plans.

Dis-Advantages Of Current Purchasing Power Accounting

This method considers only changes in general purchasing power; it does not consider changes in the value of individual items.
This method is based on a statistical index number that is not used within an individual firm.
It is difficult to choose a suitable price index.
This method fails to remove all the defects of the historical cost accounting system (Accounting Management).

Comments

Purchasing power accounting is very useful for providing financial information to management, and it preserves the purchasing power of the capital contributed by shareholders. It is especially useful in times of inflation, so at the present time this method is very useful.

Advantages of Current Cost Accounting

This method uses the present value of assets instead of the original purchase price.
This type of accounting addresses the difference between the historical and current cost accounting systems.
This method assigns higher values to the assets owned by the business.
This method is also used during bankruptcy and liquidation procedures to determine the total loss to the owner (Ask).

Comments

Cost accounting gives an accurate picture of the connection between specific costs and specific outputs because it traces resources as they move through the company. By adopting cost accounting for a business, we learn which resources are being wasted and which are most profitable (Chron).

Methods of Data Collection

1. INTRODUCTION

This report describes how data are collected and the methods used to collect data for research, whether to improve an existing piece of research or to learn more about the particular thing being analysed. The report briefly reviews the methods of collecting primary data and secondary data, along with their classifications.

2. Methods of collecting primary data
OBSERVATION
QUESTIONNAIRE
SEMI-STRUCTURED AND IN-DEPTH INTERVIEW.

2.1 OBSERVATION

Observation means finding out what people do, what they need, and so on. It combines the recording, description, analysis and interpretation of people's behaviour. There are two different types of observation:

PARTICIPANT OBSERVATION.

In participant observation the researcher becomes involved in the subjects' activities, living among them and being a member of the group; documentary films, for example, are of this kind.

The researcher's role in this type can be one of:

Complete participant
Complete observer
Observer as participant
Participant as observer.

Graphical representation of participant observation researcher roles: complete participant, participant as observer, observer as participant, and complete observer.

STRUCTURED OBSERVATION.

As the heading itself suggests, this is a structured way of carrying out data collection, involving a high level of predetermined structure; it usually forms only part of the data collection. Examples include a daily attendance sheet or a planning sheet, such as the example recording sheet outlined below.

(Example recording sheet: a grid with a row for each hour (1 to 4) and the minutes taken, and a column for each activity observed (washing, dressing, eating, mobility), each activity split into ACT and ADEQ sub-columns.)

2.2 SEMI STRUCTURED AND IN-DEPTH INTERVIEWS

This method involves interviewing a person or a group. Interviews are classified into structured, semi-structured and unstructured interviews. In structured interviews a set format of standard questions is followed to cover particular criteria.

Semi-structured interviews give the respondent some choice over which sections of questions are covered, whereas unstructured interviews involve in-depth discussion of a particular area of interest.

Interviews can be conducted face to face or in groups. Face-to-face interviews can reveal an individual's behaviour, while group interviews show how group members interact and how they differ from one another.

How these types of interview are useful in research:

Interview type | Exploratory | Descriptive | Explanatory
Structured | – | frequent | less frequent
Semi-structured | less frequent | – | more frequent
In-depth | more frequent | – | –

2.3 QUESTIONNAIRE

It is a general way of collecting data in which a person is asked to answer the same set of questions in a fixed order. It is very easy to ask questions for a study or piece of research, and much research uses the questionnaire as its main instrument for collecting information. Because questionnaires are administered at the individual level, the sample size can be larger. An interesting aspect of the questionnaire is the mode of responding to it:

Telephonic survey.
Mail (postal) survey.
E-mail survey.

(Questionnaire selection chart)

2.3.1 Telephonic survey

This is a common method in which the researcher and respondent do not know each other, so only limited data can be collected. These limitations restrict the questionnaire to a shorter format: questions must be easy for the respondent to answer quickly and must not take too long. A trained interviewer should conduct the survey, and answers can be entered directly into a spreadsheet to save time.

2.3.2 Mail (postal) survey

This is a form of survey in which the respondent and the questioner have no direct contact or interaction. The questionnaire should be planned in advance, with the design and structure of the questions framed so that the respondent can answer without skipping any question. Questions should be ordered from easy to average to difficult, which helps produce a valuable survey. Time is also an important consideration in such surveys.

2.3.3 E-mail survey

E-mail surveys are very popular surveys in which people are reached through the internet. They can be carried out in two ways: by e-mail or by using an online survey. Just as with post, an e-mail can be sent to the respondent to answer, but they may not reply for various reasons. Online surveys are better because respondents answer then and there, so data are collected faster than by mail. Today, HTML pages are used to build survey questions, and Google Forms is a particularly convenient tool for researchers to get the job done.

3. METHODS OF COLLECTING SECONDARY DATA

Collecting secondary data involves finding publications, project and research reports, ERP systems, data warehouses and data mining, and internet/web sources for the research details you need.

3.1 PUBLICATIONS

Publications refer to printed media such as newspapers, textbooks, magazines, journals and reports. These are otherwise known as reference material and contain a wide range of data. Researchers often turn to secondary data before primary data because it gives them a proper, more complete view of the research on their topic. As every publication has a specified topic, researchers can easily find sources on a topic in a systematic manner, although proper guidelines are needed to search these publications.

3.2 ERP/DATAWAREHOUSES AND MINING

In every organization, ERP systems are implemented to gather information about finance, commercial operations, accounts, production, marketing, R&D and so on.

ERP helps in research because it stores data on a daily, monthly and yearly basis in an integrated form. Researchers studying different phenomena can obtain this information, through an authorized person in the organization, for their data collection. ERP combines different functions; for example, a researcher from the financial sector who wants to examine how the organization has developed in that particular area can collect the relevant information from the ERP system. Mostly these data are considered primary data.

Data warehouses hold secondary data, where large amounts of data are stored. These data cannot be analysed manually, so data mining software is used: it segregates all kinds of data and applies statistical techniques to analyse them, such as variance analysis, cluster analysis and factor analysis. It combines statistical and information technologies; vendors of such software include XLMiner, SPSS, SAS and SYSTAT. Data mining is an automated process in which some features are selected by the user.
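
As a rough illustration (not tied to any particular ERP system or data warehouse), the sketch below shows the kind of cluster analysis a data-mining tool might run, using R's built-in iris data set as a stand-in for warehoused records:

data(iris)
measurements <- iris[, 1:4]                          # keep only the numeric columns
clusters <- kmeans(scale(measurements), centers = 3, nstart = 25)
table(clusters$cluster, iris$Species)                # compare clusters with the known grouping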

3.3 Internet/web

The most basic way of collecting secondary data is to search the web. The internet makes it easy and fast to search topics and related terms, and a huge amount of data can be found across thousands of websites all over the world, including e-textbooks, journals and government reports. Search engines such as Google and Yahoo will return many sites, but one must choose data that genuinely relate to the research topic. A popular starting website for researchers is Wikipedia, where notes on a particular topic are given with reference links for more detailed study.

SOME OF THE IMPORTANT WEBSITES

Owner/Sponsor | Site address | Description
World Bank | WWW.worldbank.org | Data
Reserve Bank of India | www.rbi.org.in | Economic data, banking data
EBSCO | http://web.ebscohost.com | Research databases (paid)
ISI (Indian Statistical Institute) | www.isical.ac.in/-library/ | Web library

4. conclusion

From the information given, we know what primary and secondary data are and how to collect them from various sources. Research must be valuable, so data collection must be done thoroughly to obtain correct results from the analysis. Secondary data can be included in research reports, but there must also be data that show your own involvement in the research process. Research is an endless process: as time changes, the strategy and contents of reports also vary because respondents are not the same in nature. Research on a topic gives an overview, detail and explanation according to the type of research. Finally, the collection of data is the most important part of research because it acts as the proof or evidence behind your reports.

Table of Contents

1. INTRODUCTION
2. METHODS OF COLLECTING PRIMARY DATA
2.1 OBSERVATION
2.1.1 PARTICIPANT OBSERVATION
2.1.2 STRUCTURED OBSERVATION
2.2 SEMI-STRUCTURED AND IN-DEPTH INTERVIEWS
2.3 QUESTIONNAIRE
2.3.1 TELEPHONIC SURVEY
2.3.2 MAIL (POSTAL) SURVEY
2.3.3 E-MAIL SURVEY
3. METHODS OF COLLECTING SECONDARY DATA
3.1 PUBLICATIONS
3.2 ERP/DATA WAREHOUSES AND MINING
3.3 INTERNET/WEB
4. CONCLUSION


Measuring weak-form market efficiency

ABSTRACT

This paper tests weak-form efficiency in the U.S. market. Both daily and monthly returns are employed for autocorrelation analysis, variance ratio tests and delay tests. Three conclusions are reached. Firstly, security returns are predictable to some extent. While individual stock returns are weakly negatively correlated and difficult to predict, market-wide indices with outstanding recent performance show a positive autocorrelation and offer more predictable profit opportunities. Secondly, monthly returns follow random walk better than daily returns and are thus more weak-form efficient. Finally, weak-form inefficiency is not necessarily bad. Investors should be rewarded a certain degree of predictability for bearing risks.

Efficient market hypothesis (EMH), also known as “information efficiency”, refers to the extent to which stock prices incorporate all available information. The notion is important in helping investors to understand security behaviour so as to make wise investment decisions. According to Fama (1970), there are three versions of market efficiency: the weak, semistrong, and strong form. They differ with respect to the information that is incorporated in the stock prices. The weak form efficiency assumes that stock prices already incorporate all past trading information. Therefore, technical analysis on past stock prices will not be helpful in gaining abnormal returns. The semistrong form efficiency extends the information set to all publicly available information including not only past trading information but also fundamental data on firm prospects. Therefore, neither technical analysis nor fundamental analysis will be able to produce abnormal returns. Strong form efficiency differs from the above two in stating that stock prices not only reflect publicly available information but also private inside information. However, this form of market efficiency is always rejected by empirical evidence.

If weak-form efficiency holds true, the information contained in past stock price will be completely and instantly reflected in the current price. Under such condition, no pattern can be observed in stock prices. In other words, stock prices tend to follow a random walk model. Therefore, the test of weak-form market efficiency is actually a test of random walk but not vice versa. The more efficient the market is, the more random are the stock prices, and efforts by fund managers to exploit past price history will not be profitable since future prices are completely unpredictable. Therefore, measuring weak-form efficiency is crucial not only in academic research but also in practice because it affects trading strategies.

This paper primarily tests weak-form efficiency for three stocks, Faro Technologies Inc. (FARO), FEI Company (FEIC) and Fidelity Southern Corporation (LION), and two decile indices, the NYSE/AMEX/NASDAQ capitalisation-based Deciles 1 and 10 (NAN D1 and NAN D10). Both daily and monthly data are employed here to detect any violation of the random walk hypothesis.

The remainder of the paper is structured in the following way. Section I provides a brief introduction of the three firms and two decile indices. Section II describes the data and discusses the methodology used. Section III presents descriptive statistics. Section IV is the result based on empirical analysis. Finally, section V concludes the paper.

I. The Companies[1]

A. Faro Technologies Inc (FARO)

FARO Technologies is an instrument company whose principal activities include designing and developing portable 3-D electronic measurement systems for industrial applications in manufacturing. The company's principal products include the Faro Arm, Faro Scan Arm and Faro Gage articulated measuring devices. It mainly operates in the United States and Europe.

B. FEI Company (FEI)

FEI is a leading scientific instruments company which develops and manufactures diversified semiconductor equipment, including electron microscopes and beam systems. It operates in four segments: NanoElectronics, NanoResearch and Industry, NanoBiology, and Service and Components. With a 60-year history, it now has approximately 1,800 employees and sells products to more than 50 countries around the world.

C. Fidelity Southern Corp. (LION)

Fidelity Southern Corp. is one of the largest community banks in metro Atlanta which provides a wide range of financial services including commercial and mortgage services to both corporate and personal customers. It also provides international trade services, trust services, credit card loans, and merchant services. The company provides financial products and services for business and retail customers primarily through branches and via internet.

D. NYSE/AMEX/NASDAQ Index

It is an index taken from the Center for Research in Security Prices (CRSP) which includes all common stocks listed on the NYSE, Amex, and NASDAQ National Market. The index is constructed by ranking all NYSE companies according to their market capitalization in the first place. They are then divided into 10 decile portfolios. Amex and NASDAQ stocks are then placed into the deciles based on NYSE breakpoints. The smallest and the largest firms based on market capitalization are placed into Decile 1 and Decile 10, respectively.

II. Data and Methodology

A. Data

Data for the three stocks and two decile indices in our study are all obtained from the Center for Research in Security Prices (CRSP) database on both a daily and a monthly basis from January 2000 to December 2005. Returns are then computed on both bases, generating a total of 1507 daily observations and 71 monthly observations. The NYSE/AMEX/NASDAQ Index is CRSP capitalisation-based, so that Deciles 1 and 10 represent the smallest and largest firms, respectively, by market capitalisation. In addition, the Standard and Poor's 500 Index (S&P 500) is used as a proxy for the market index; it is a value-weighted index which incorporates the 500 largest stocks in the US market. For comparison purposes, both continuously compounded (log) returns and simple returns are reported, although the analysis is based on the former.
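
As a minimal sketch of how the return series are constructed, the R lines below compute log and simple returns from a hypothetical vector of closing prices; the CRSP price data would be handled in the same way once read in:

prices <- c(100, 101.5, 100.8, 102.3, 103.0)         # hypothetical closing prices
log_returns    <- diff(log(prices))                   # continuously compounded (log) returns
simple_returns <- diff(prices) / head(prices, -1)     # simple returns for comparison
cbind(log_returns, simple_returns)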

B. Methods

B.1. Autocorrelation Tests

One of the most intuitive and simple tests of a random walk is to test for serial dependence, i.e. autocorrelation. Autocorrelation is a time-series phenomenon: the serial correlation between values of a series at certain lags. The first-order autocorrelation, for instance, indicates to what extent neighbouring observations are correlated. The autocorrelation test is typically used to test RW3, a less restrictive version of the random walk model which allows dependent but uncorrelated increments in the return data. The formula for the autocorrelation at lag k is given by:

ρ_k = [Σ_{t=k+1}^{T} (r_t - r̄)(r_{t-k} - r̄)] / [Σ_{t=1}^{T} (r_t - r̄)²]   (1)

where ρ_k is the autocorrelation at lag k; r_t is the log-return on the stock at time t; r_{t-k} is the log-return on the stock at time t - k; and r̄ is the mean log-return over the T observations. A ρ_k greater than zero indicates positive serial correlation, whereas a ρ_k less than zero indicates negative serial correlation. Both positive and negative autocorrelation represent departures from the random walk model. If ρ_k is significantly different from zero, the null hypothesis of a random walk is rejected.

The autocorrelation coefficients up to 5 lags for daily data and 3 lags for monthly data are reported in our test. Results of the Ljung-Box test for all lags up to the above mentioned for both daily and monthly data are also reported. The Ljung-Box test is a more powerful test by summing the squared autocorrelations. It provides evidence for whether departure for zero autocorrelation is observed at all lags up to certain lags in either direction. The Q-statistic up to a certain lag m is given by:

Q_m = T(T + 2) Σ_{k=1}^{m} ρ_k² / (T - k)   (2)

where T is the number of observations and m is the number of lags tested.
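
A minimal R sketch of both tests, assuming r holds a vector of log-returns (a simulated placeholder series is used here, so the reported statistics are illustrative only):

set.seed(42)
r <- rnorm(1507)                              # placeholder series the length of the daily sample
acf(r, lag.max = 5, plot = FALSE)             # autocorrelation coefficients up to lag 5
Box.test(r, lag = 5, type = "Ljung-Box")      # joint Ljung-Box Q test up to lag 5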

B.2. Variance Ratio Tests

We follow Lo and MacKinlay’s (1988) single variance ratio (VR) test in our study. The test is based on a very important assumption of random walk that variance of increments is a linear function of the time interval. In other words, if the random walk holds, the variance of the qth differed value should be equal to q times the variance of the first differed value. For example, the variance of a two-period return should be equal to twice the variance of the one-period return. According to its definition, the formula of variance ratio is denoted by:

VR(q) = Var[r_t(q)] / (q · Var[r_t])   (3)

where r_t(q) = r_t + r_{t-1} + … + r_{t-q+1} is the q-period return and q is any positive integer. Under the null hypothesis of a random walk, VR(q) should be equal to one at all lags. If VR(q) is greater than one, there is positive serial correlation, which indicates persistence in prices, corresponding to the momentum effect. If VR(q) is less than one, there is negative serial correlation, which indicates a reversal in prices, corresponding to a mean-reverting process.
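
The following R sketch computes VR(q) in a simplified form (overlapping q-period sums, without the Lo and MacKinlay bias corrections), again using a simulated placeholder series in place of the actual return data:

variance_ratio <- function(r, q) {
  cs <- cumsum(r)
  rq <- cs[q:length(cs)] - c(0, cs[1:(length(cs) - q)])   # overlapping q-period returns
  var(rq) / (q * var(r))                                  # VR(q); near 1 under a random walk
}
set.seed(42)
r <- rnorm(1507)
sapply(c(2, 4, 8, 16), function(q) variance_ratio(r, q))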

Note that the above two tests are also tests of how stock prices react to publicly available information in the past. If market efficiency holds true, information from past prices should be immediately and fully reflected in the current stock price; therefore, the expected future price change conditioned on past prices should be equal to zero.

B.3. Griffin-Kelly-Nardari DELAY Tests

As defined by Griffin, Kelly and Nardari (2007), “delay is a measure of sensitivity of current returns to past market-wide information”.[2] Speaking differently, delay measures how quickly stock returns can react to market returns. The logic behind this is that a stock which is slow to incorporate market information is less efficient than a stock which responds quickly to market movements.

S&P 500 index is employed in delay test to examine the sensitivity of stock returns to market information. For each stock and decile index, both restricted and unrestricted models are estimated from January 2000 to December 2005. The unrestricted model is given by:

r_{i,t} = α_i + β_i R_{m,t} + Σ_{n=1}^{N} δ_{i,n} R_{m,t-n} + ε_{i,t}   (4)

where r_{i,t} is the log-return on stock i at time t; R_{m,t} is the market log-return (the return on the S&P 500 index) at time t; R_{m,t-n} is the lagged market return; δ_{i,n} is the coefficient on the lagged market return; and n is the lag, which is 1, 2, 3, 4 for the daily data and 1, 2, 3 for the monthly data. The restricted model, which sets all δ_{i,n} to zero, is:

r_{i,t} = α_i + β_i R_{m,t} + ε_{i,t}   (5)

Delay is then calculated from the adjusted R-squares of the above regressions as follows:

Delay_1 = 1 - (adjusted R² of the restricted model) / (adjusted R² of the unrestricted model)   (6)

An alternative scaled measure of delay is given by:

(7) Both measures are reported in a way that the larger the calculated delay value, the more return variation is explained by lagged market returns and thus the more delayed response to the market information.
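
A minimal R sketch of Delay_1 in equation (6), using simulated stock and market return series as placeholders for the CRSP and S&P 500 data (the lag structure matches the daily specification with four lags):

set.seed(42)
r_m <- rnorm(1507)                                        # placeholder market (S&P 500) log-returns
r_i <- 0.5 * r_m + 0.2 * c(0, head(r_m, -1)) +            # toy stock returns reacting to the
       rnorm(1507, sd = 0.5)                              # market with a one-day delay
lags <- sapply(1:4, function(n) c(rep(NA, n), head(r_m, -n)))
keep <- 5:length(r_i)                                     # drop rows lost to lagging
unrestricted <- lm(r_i[keep] ~ r_m[keep] + lags[keep, ])  # equation (4)
restricted   <- lm(r_i[keep] ~ r_m[keep])                 # equation (5)
1 - summary(restricted)$adj.r.squared /
    summary(unrestricted)$adj.r.squared                   # Delay_1, equation (6)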

III. Descriptive Statistics

A. Daily frequencies

Table I shows the summary statistics of daily returns for the three stocks and two decile indices. The highest mean return is for FARO (0.0012), whereas the lowest mean return is for NAN D10 (0.0000). In terms of median return, NAN D1 (0.0015) outperforms all the other stocks. Both the highest maximum return and the lowest minimum return (0.2998 and -0.2184, respectively) are for FARO, corresponding to its highest standard deviation (0.0485) among all, indicating that FARO is the most volatile in returns. On the other hand, both the lowest maximum return and the highest minimum return (0.0543 and -0.0675, respectively) are for NAN D10. However, NAN D10 is only the second least volatile; the lowest standard deviation is for NAN D1 (0.0108). Figures 1 and 2 present the price levels of the most and least volatile index (stock). All the above observations remain true if we change from a log-return basis to a simple return basis.

In terms of the degree of asymmetry of the return distributions, all stocks and indices are positively skewed, with the only exception of NAN D1. The positive skewness implies that more extreme values lie in the right tail of the distribution, i.e. stocks are more likely to have times when performance is extremely good. On the other hand, NAN D1 is slightly negatively skewed, which means that returns are more likely to be lower than what would be expected under a normal distribution. In measuring the 'peakedness' of the return distributions, positive excess kurtosis is observed in all stocks and indices, i.e. a leptokurtic distribution, which means that returns either cluster around the mean or disperse in the two tails of the distribution. All the above observations lead us to reject the null hypothesis that daily returns are normally distributed. What's more, results from the Jarque-Bera test provide supporting evidence for rejection of the normality hypothesis at all significance levels for all stocks and indices.

B. Monthly frequencies

Descriptive statistics of monthly returns are likewise presented in Table II. Most of the above conclusions reached for daily returns are also valid in the context of monthly returns; in other words, what is the highest (lowest) value for daily returns is also the highest (lowest) for monthly returns in most cases. The only exceptions are the highest median return and the lowest minimum return and standard deviation. Here, NAN D10 (0.0460) and FARO (0.1944) have the least and most dispersion according to their standard deviations, compared with NAN D1 and FARO in the daily case. From the above observations, we can see that the decile indices are more stable than individual stocks in terms of returns. What's more, monthly returns are generally larger in magnitude than daily returns.

Coming to the measurement of asymmetry and peakedness of the return distributions, only NAN D10 (-0.4531) is negatively skewed, and the degree of skewness is not far from 0. The other stocks and indices are all positively skewed, with both FEIC (0.0395) and LION (0.0320) having skewness values very close to 0. Almost all stocks and indices have a degree of kurtosis similar to that of the normal distribution, except that NAN D1 (8.6623) is highly peaked. This is also consistent with the JB p-values, based on which we conclude that FEIC, LION and NAN D10 are approximately normal because we fail to reject the hypothesis that they are normally distributed at the 5% or higher levels (see Figures 3 and 4 for reference). However, when a simple return basis is used, FEIC is no longer normally distributed even at the 1% significance level. Apart from this, using simple returns produces similar results.

IV. Results

A. Autocorrelation Tests

A.1. Tests for Log-Returns

The results of autocorrelation tests for up to 5 lags of daily log-returns and up to 3 lags of monthly log-returns for three stocks and two decile indices from January 2000 to December 2005 are summarised in Table III. Both the autocorrelation (AC) and partial autocorrelation (PAC) are examined in our tests.

As is shown in Panel A, all 5 lags of FARO, FEIC and NAN D10 for both AC and PAC are insignificant at the 5% level, except for the fourth-order PAC coefficient of FARO (-0.052), which is slightly negatively significant. On the contrary, NAN D1 has significant positive AC and PAC at almost all lags; only at the fourth order is its PAC (0.050) barely within the 5% significance bound. The significant AC and PAC coefficients reject the null hypothesis of no serial correlation in NAN D1, thereby rejecting weak-form efficiency. In terms of LION, significant negative autocorrelation coefficients are only observed at the first two orders, and its higher-order coefficients are not statistically significant. Besides that, we find that all the stocks and indices have negative autocorrelation coefficients at most of their lags, with the only exception of NAN D1, whose coefficients are all positive. The strictly positive AC and PAC indicate persistence in returns, i.e. a momentum effect for NAN D1, which means that good or bad performances in the past tend to continue over time.

We also present the Ljung-Box (L-B) test statistic in order to see whether autocorrelation coefficients up to a specific lag are jointly significant. Since RW1 implies all autocorrelations are zero, the L-B test is more powerful because it tests the joint hypothesis. As is shown in the table, both LION and NAN D1 have significant Q values in all lags at all levels, while none of FARO, FEIC and NAN D10 has significant Q values.

Based on the above daily observations, we may conclude that the null hypothesis of no serial correlation is rejected at all levels for LION and NAN D1, but cannot be rejected at either the 5% or the 10% level for FARO, FEIC and NAN D10. This means that both LION and NAN D1 are weak-form inefficient. Looking at their past performance, we find that while NAN D1 outperformed the market in the sample period, LION performed badly over the same period. Therefore, it seems that stocks or indices with the best and worst recent performance have stronger autocorrelation. In particular, NAN D1 shows positive autocorrelation in returns, suggesting that market-wide indices with outstanding recent performance have momentum in returns over short periods, which offers predictable opportunities to investors.

When monthly returns are employed, no single stock or index has significant AC or PAC at any reported lag at the 5% level. This is in contrast with daily returns, and means that monthly returns follow a random walk better than daily returns. The more powerful L-B test confirms this conclusion by showing that the Q statistics for all stocks and indices are statistically insignificant at both the 5% and 10% levels. Therefore, the L-B null hypothesis of no autocorrelation cannot be rejected for any stock or index up to 3 lags. Compared with daily returns, monthly returns seem to follow a random walk better and are thus more weak-form efficient.

A.2. Tests for Squared Log-Returns

Even when returns are not correlated, their volatility may be correlated. Therefore, it is necessary for us to expand the study from returns to variances of returns. Squared log-returns and absolute value of log-returns are measures of variances and are thus useful in studying the serial dependence of return volatility. The results of autocorrelation analysis for daily squared log-returns for all three stocks and two decile indices are likewise reported in Table IV.

In contrast to the results for log-returns, coefficients for FEIC, LION, NAN D1 and NAN D10 are significantly different from zero, except for the fourth-order PAC coefficient (0.025) for FEIC, the fifth-order PAC coefficient for LION (-0.047), and the third- and fourth-order PAC coefficients for NAN D1 (-0.020 and -0.014, respectively). FARO has significant positive AC and PAC at the first lag and a significant AC at the third lag. The L-B test provides stronger evidence against the null hypothesis that the sum of the squared autocorrelations up to 5 lags is zero for all stocks and indices at all significance levels, based on which we confirm our result that squared log-returns do not follow a random walk. Another contrast with the results for log-returns is that almost all the autocorrelation coefficients are positive, indicating a stronger positive serial dependence in squared log-returns.

In terms of monthly data, only FEIC and NAN D10 have significant positive third-order AC and PAC estimates. Other stocks and indices have coefficients not significantly different from zero. The result is supported by Ljung-Box test statistics showing that Q values are only statistically significant in the third lag for both FEIC and NAN D10. This is consistent with the result reached for log-returns above, which says that monthly returns appear to be more random than daily returns.

A.3. Tests for the Absolute Values of Log-Returns

Table V provides autocorrelation results for the absolute values of log-returns in a similar manner. However, as will be discussed below, the results are even more contrasting than those in Table IV.

In Panel A, all the stocks and indices have significant positive serial correlation while insignificant PAC estimates are only displayed in lag 5 for both FARO and LION. Supporting above result, Q values provide evidence against the null hypothesis of no autocorrelation. Therefore, absolute value of daily log-returns exhibit stronger serial dependence than in Table III and IV, and autocorrelations are strictly positive for all stocks and indices. Coming to the absolute value of monthly log-returns, only FEIC displays significant individual and joint serial correlation. NAN D1 also displays a significant Q value in lag 2 at 5% level, but it is insignificant at 1% level.

Based on the above evidence, two consistent conclusions can be made at this point. First of all, by changing ingredients in our test from log-returns to squared log-returns and absolute value of log-returns, more positive serial correlation can be observed, especially in daily data. Therefore, return variances are more correlated. Secondly, monthly returns tend to follow a random walk model better than daily returns.

A.4. Correlation Matrix of Stocks and Indices

Table VI presents the correlation matrix for all stocks and indices. As is shown in Panel A for daily result, all of the correlations are positive, ranging from 0.0551 (LION-FARO) to 0.5299 (NAN D10-FEIC). Within individual stocks, correlation coefficients do not differ a lot. The highest correlation is between FEIC and FARO with only 0.1214, indicating a fairly weak relationship between individual stocks returns. However, in terms of stock-index relationships, they differ drastically from 0.0638 (NAN D10-FARO) to 0.5299 (NAN D10-FEIC). While the positive correlation implies that the three stocks follow the indices in the same direction, the extent to which they will move with the indices is quite different, indicating different levels of risk with regard to different stock. Finally, we find that the correlation between NAN D10 and NAN D1 is the second highest at 0.5052.

Panel B provides the correlation matrix for monthly data. Similar to results for daily data, negative correlation is not observed. The highest correlation attributes to that between NAN D10 and FEIC (0.7109) once again, but the lowest is between LION and FEIC (0.1146) this time. Compared with results in Panel A, correlation within individual stocks is slightly higher on average. The improvement in correlation is even more obvious between stocks and indices. It implies that stock prices can change dramatically from day to day, but they tend to follow the movement of indices in a longer horizon. Finally, the correlation between two indices is once again the second highest at 0.5116, following that between NAN D10 and FEIC. It is also found that the correlation between indices improves only marginally when daily data are replaced by monthly data, indicating a relatively stable relationship between indices.

B. Variance Ratio Tests

The results of variance ratio tests are presented in Table VII for each of the three stocks and two decile indices. The test is designed to test for the null hypothesis of a random walk under both homoskedasticity and heteroskedasticity. Since the violation of a random walk can result either from changing variance, i.e. heteroskedasticity, or autocorrelation in returns, the test can help to discriminate reasons for deviation to some extent. The lag orders are 2, 4, 8 and 16. In Table VII, the variance ratio (VR(q)), the homoskedastic-consistent statistics (Z(q)) and the heteroskedastic-consistent statistics (Z*(q)) are presented for each lag.

As is pointed out by Lo and MacKinlay (1988), the variance ratio statistic VR(2) is equal to one plus the first-order correlation coefficient. Since all the autocorrelations are zero under RW1, VR(2) should equal one. The conclusion can be generalised further to state that for all q, VR(q) should equal one.

According to the first panel in Table VII, of all stocks and indices, only LION and NAN D1 have variance ratios that are significantly different from one at all lags. Therefore, the null hypothesis of a random walk under both homoskedasticity and heteroskedasticity is rejected for LION and NAN D1, and they are not weak-form efficient because of autocorrelation. In terms of FARO, the null hypothesis of a homoskedastic random walk is rejected, while the hypothesis of a heteroskedastic random walk is not, implying that the rejection under homoskedasticity could result, at least in part, from heteroskedasticity rather than autocorrelation. On the other hand, both FEIC and NAN D10 follow a random walk and turn out to be efficient in the weak form, corresponding exactly to the autocorrelation results reached before in Table III.

Panel B shows that when monthly data are used, the null hypothesis under both forms of random walk can only be rejected for FARO. As for FEIC, the random walk null hypothesis is rejected under homoskedasticity but not under heteroskedasticity, indicating that the rejection is likely driven by changing variances rather than autocorrelation, since the heteroskedasticity-consistent Z*(q) does not reject.

As is shown in Panel A for daily data, all individual stocks have variance ratios less than one, implying negative autocorrelation. However, the autocorrelation for stocks is statistically insignificant except for LION. On the other hand, variance ratios for NAN D1 are greater than one and increasing in q. The above finding provides supplementary evidence to the results of autocorrelation tests. As Table III shows, NAN D1 has positive autocorrelation coefficients in all lags, suggesting a momentum effect in multiperiod returns. Both findings appear to be well supported by empirical evidence. While daily returns of individual stocks seem to be weakly negatively correlated (French and Roll (1986)), returns for best performing market indices such as NAN D1 show strong positive autocorrelation (Campbell, Lo, and MacKinlay (1997)). The fact that individual stocks have statistically insignificant autocorrelations is mainly due to the specific noise contained in company information, which makes individual security returns unpredictable. On the contrary, while the positive serial correlation for NAN D1 violates the random walk, such deviation provides investors with confidence to forecast future prices and reliability to make profits.

C. Griffin, Kelly and Nardari DELAY Tests

The results of the delay tests for the three stocks and two decile indices over the January 2000 to December 2005 period are summarised in Table VIII. We use lags 1 to 4 for the daily data and lags 1 to 3 for the monthly data.
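The exact definitions of Delay_1 and Delay_2 follow Griffin, Kelly and Nardari (2006). As a rough sketch of the general logic only, the Python code below computes an R-squared-based delay measure by comparing a regression of stock returns on contemporaneous and lagged market returns with a restricted regression on the contemporaneous return alone; the simulated data, variable names and lag choice are hypothetical and do not reproduce the exact specification behind Table VIII.

import numpy as np
import statsmodels.api as sm

def delay_r2(stock_ret, market_ret, n_lags=4):
    """Illustrative R-squared-based delay: 1 - R2(restricted) / R2(unrestricted).
    The unrestricted model regresses stock returns on contemporaneous and lagged
    market returns; the restricted model uses the contemporaneous return only."""
    y = np.asarray(stock_ret, dtype=float)[n_lags:]
    x = np.asarray(market_ret, dtype=float)
    # column 0 is the contemporaneous market return, columns 1..n_lags are its lags
    X_full = np.column_stack([x[n_lags - k:len(x) - k] for k in range(n_lags + 1)])
    full = sm.OLS(y, sm.add_constant(X_full)).fit()
    restricted = sm.OLS(y, sm.add_constant(X_full[:, [0]])).fit()
    return 1.0 - restricted.rsquared / full.rsquared

# Simulated example in which the stock responds to market news with a one-day lag.
rng = np.random.default_rng(1)
m = rng.normal(0.0, 0.01, size=1500)                   # hypothetical market returns
s = 0.4 * m + 0.6 * np.roll(m, 1) + rng.normal(0.0, 0.005, size=1500)
s[0] = 0.0                                             # discard the wrapped-around first value
print(round(delay_r2(s, m, n_lags=4), 3))              # noticeably above zero for a delayed stock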

As presented in Panel A for daily returns, the Delay_1 value for NAN D10 is close to zero and hence not significant, while NAN D1 has the highest delay among all stocks and indices. Among the individual stocks, delay appears to increase with size: LION, the stock with the smallest market capitalisation, has the lowest delay, while FEIC, the stock with the largest market capitalisation, has the highest. This appears to contradict Griffin, Kelly and Nardari (2006), who report an inverse relationship between size and delay. One possible explanation is that delay measures calculated from daily data on individual firms are noisy.

The scaled measure Delay_2 produces a consistent conclusion but with higher magnitudes. Delay_2 values are far from zero for FARO, FEIC, LION and NAN D1; the largest increase is seen for FARO, from 0.0067 for Delay_1 to 0.7901 for Delay_2. The unscaled Griffin, Kelly and Nardari delay measure is therefore preferable, because the scaled version can produce large values without economic significance.

As displayed in Panel B, employing monthly data also leads to higher Delay_1 values, indicating that more of the variation in monthly returns is captured by lagged market returns and hence that monthly returns are not as sensitive as daily returns to market-wide news. However, an inverse relationship between delay and market value of individual stocks is found this time. The monthly data therefore provide results consistent with Griffin, Kelly and Nardari (2006), as one would normally expect larger stocks to respond to the market more efficiently. As with the daily data, the scaled measure again produces higher values than its alternative but leads to the same conclusions.

V. Conclusion

The main objective of this paper is to test weak-form efficiency in the U.S. market. Across the selected tests, NAN D10 and FEIC provide the most consistent evidence of weak-form efficiency, while deviations from a random walk are suggested for the other stocks and indices, especially NAN D1 and LION. This indicates that security returns are predictable to some degree, especially for those with the best and worst recent performance.

The three autocorrelation tests provide different results for daily returns. While the null hypothesis of a random walk is rejected for NAN D1 and LION based on log-returns, it is rejected for all stocks and indices based on both squared and absolute log-returns, indicating that return variances are more strongly correlated. Results for monthly returns, on the other hand, are consistent: monthly returns follow a random walk much more closely than daily returns in all three tests. Most notably, the autocorrelation test fails to reject the random walk hypothesis for any stock or index when monthly log-returns are employed.

The variance ratio tests provide evidence that supports the autocorrelation tests. Both find that, in terms of daily returns, NAN D1 and LION show significant return dependence. In particular, the variance ratios for NAN D1 are all above one, corresponding to its positive AC and PAC coefficients and thus implying positive autocorrelation in returns. Moreover, the individual stocks have variance ratios below one, with FEIC and FARO both insignificant. Taken together, the evidence suggests that while individual stock returns are weakly negatively related and difficult to predict, market-wide indices with outstanding recent performance such as NAN D1 tend to show stronger positive serial correlation and thus offer predictable profit opportunities.

The evidence from the delay tests is largely consistent with the earlier findings. NAN D1 has the highest delay in both the daily and monthly cases, implying an inefficient response to market news. For monthly log-returns, delay values for individual stocks are inversely related to market capitalisation, with larger stocks having lower delay, suggesting that small stocks incorporate past public information slowly and are thus less efficient.

Finally, deviation from a random walk model, and thus weak-form inefficiency, is not necessarily bad. In fact, a certain degree of predictability may simply reward investors for bearing risk. Future research could therefore incorporate risk into the model.

[1] Company information is mainly obtained from Thomson One Banker database.

[2] Griffin, John M., Patrick J. Kelly, and Federico Nardari, 2006, Measuring short-term international stock market efficiency, Working Paper

Multivariate Multilevel Modeling

Literature Review

This chapter ties together the various studies related to modeling responses multivariately in a multilevel framework. It begins by laying out the recent history of univariate techniques for analyzing categorical data in a multilevel context, then presents the literature on fitting multivariate multilevel models for categorical and continuous data. Moreover, it reviews the evidence on imputing missing values for partially observed multivariate multilevel data sets.

The Nature of Multivariate Multilevel models

A multivariate multilevel model can be considered as a model for multiple dependent variables with a hierarchical structure. Although multivariate analysis increases the complexity in a multilevel context, it is an essential tool because it allows a single test of the joint effects of a set of explanatory variables on several dependent variables (Snijders & Bosker, 2000). These models can also increase the construct validity of analyses of complex real-world concepts. Consider a study of school effectiveness measured on three different outcome variables: math achievement, reading proficiency and well-being at school. These data are collected on students who are clustered within schools, implying a hierarchical structure. Although it is certainly possible to handle the three outcomes separately, doing so fails to show the overall picture of school effectiveness. Multivariate analysis is therefore preferable in such scenarios, since it can decrease Type I error and increase statistical power (Maeyer, Rymenans, Petegem and Bergh, draft).

The hierarchical structure of multivariate models differs from that of univariate response models. The example above appears to imply a two-level multivariate model, but in fact it has three levels: the measurements are the level-1 units, the students the level-2 units and the schools the level-3 units.
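As a minimal sketch of this representation, assuming two continuous responses and a single student-level covariate x_ij, the model with the measurements as the bottom level can be written as

\[
y_{hij} = \beta_{0h} + \beta_{1h} x_{ij} + u_{hj} + e_{hij}, \qquad h = 1, 2,
\]
\[
\begin{pmatrix} u_{1j} \\ u_{2j} \end{pmatrix} \sim N(\mathbf{0}, \Omega_u),
\qquad
\begin{pmatrix} e_{1ij} \\ e_{2ij} \end{pmatrix} \sim N(\mathbf{0}, \Omega_e),
\]

where h indexes the response (the level-1 measurement), i the student and j the school. The off-diagonal elements of \Omega_u and \Omega_e capture the between-school and within-student correlations among the outcomes, which is precisely the information that separate univariate models discard.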

Importance of Multivariate Multilevel Modeling

Multivariate multilevel data structures present greater complexity, since multilevel effects must be handled together with the multivariate context. Traditional statistical techniques are poorly suited to such data: ignoring the multivariate structure reduces statistical efficiency and can produce overestimated standard errors, while violating the independence assumption may lead to underestimated standard errors for the regression coefficients. Multivariate multilevel approaches address these problems by allowing variation at different levels to be estimated. Furthermore, Goldstein (1999) has shown that accounting for clustering provides accurate standard errors, confidence intervals and significance tests.

A number of articles have been published on multilevel modeling in the single-response context, whereas the multivariate multilevel concept has entered the statistical literature only in recent years. When the goal is to identify the effects of a set of explanatory variables on a set of dependent variables, and those effects differ considerably across the response variables, the problem can only be handled properly by means of a multivariate analysis (Snijders & Bosker, 2000).

Software for Multivariate Multilevel Modeling

In past decades, due to the unavailability of software for fitting multivariate multilevel data, some researchers relied on manual methods such as the EM algorithm (Kang et al., 1991). As the technical environment developed, software such as STATA, SAS and S-Plus emerged with facilities for handling multilevel data, but none of those packages could fit multivariate multilevel models. There is evidence in the literature that nonlinear multivariate multilevel models can be fitted using packages such as GLLAMM (Rabe-Hesketh, Pickles and Skrondal, 2001) and aML (Lillard and Panis, 2000), but this software was not flexible to work with.

Therefore the MLwiN software, which has been under development since the late 1980s at the University of Bristol in the UK, was modified to fulfill that requirement. However, the use of MLwiN for fitting multivariate multilevel models has been challenged by Goldstein, Carpenter and Browne (2014), who concluded that MLwiN was useful only when fitting the model without imputing missing values. The REALCOM software then came into the field and provided the flexibility to impute missing values in the MLwiN environment.

MLwiN is a modified version of the DOS MLn program, which used a command-driven interface. MLwiN provides the flexibility to fit very large and complex models using both frequentist and Bayesian estimation, along with missing value imputation, in a user-friendly interface. It also includes some advanced features that are not available in other packages.

Univariate Multilevel Modeling vs. Multivariate Multilevel Modeling

In general, data are often collected on multiple correlated outcomes. One major issue that has dominated the field for many years is the practice of modeling the association between risk factors and each outcome in a separate model. This can be statistically inefficient, since it ignores outcome correlations and common predictor effects (Oman, Kamal and Ambler, unpublished).

Most researchers therefore tend to include all related outcomes in a single regression model within a multivariate outcome framework rather than a univariate one. Recent investigations comparing univariate and multivariate approaches have shown that a single multivariate model is preferable to several univariate models.

Griffiths, Brown and Smith (2004) conducted a study comparing univariate and multivariate multilevel models for repeated measures of antenatal care use in Uttar Pradesh, India. They examined many factors that may be related to a mother's decision to use antenatal care services for a particular pregnancy, comparing a univariate multilevel logistic regression model with a multivariate multilevel logistic regression model. When the univariate models were fitted, model assumptions were violated and stable parameter estimates could not be obtained. They therefore preferred the multivariate approach to the univariate one.

Generalized Cochran-Mantel-Haenszel Tests for Checking Association of Multilevel Categorical Data

The concepts behind the Generalized Cochran-Mantel-Haenszel test date back to the late 1950s. Cochran (1958), a great statistician, first introduced a test for the independence of multiple 2 × 2 tables by extending the general chi-square test for independence of a single two-way table. Here each table is defined by one or two additional stratifying variables, reflecting the multilevel nature of the data. The test statistic is based on the row totals of each table, under the assumption that the cell counts follow a binomial distribution.

Mantel and Haenszel (1959) extended Cochran's test statistic to condition on both row and column totals, assuming that the cell counts of each table follow a hypergeometric distribution. Since the Cochran-Mantel-Haenszel (CMH) statistic is limited to binary data, Landis et al. (1978) generalized the test to handle variables with more than two levels. However, the Generalized Cochran-Mantel-Haenszel (GCMH) test has a major drawback: it cannot handle clustered, correlated categorical data. Liang (1985) proposed a test statistic to overcome this problem, but that statistic itself had major shortcomings and did not come into use.
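For illustration, the classical CMH statistic for K stratified 2 × 2 tables can be computed as in the following Python sketch; the data are hypothetical, and the generalizations discussed below extend this idea to variables with more levels and to clustered data.

import numpy as np
from scipy.stats import chi2

def cmh_statistic(tables):
    """Classical Cochran-Mantel-Haenszel statistic for K stratified 2x2 tables.
    Each table is [[a, b], [c, d]]; the statistic tests conditional independence
    across strata and is referred to a chi-square distribution on 1 df."""
    tables = np.asarray(tables, dtype=float)
    a = tables[:, 0, 0]                                # observed count in cell (1, 1) per stratum
    row1 = tables[:, 0, :].sum(axis=1)
    col1 = tables[:, :, 0].sum(axis=1)
    n = tables.sum(axis=(1, 2))
    expected = row1 * col1 / n
    variance = row1 * (n - row1) * col1 * (n - col1) / (n ** 2 * (n - 1))
    stat = (abs((a - expected).sum()) - 0.5) ** 2 / variance.sum()  # with continuity correction
    return stat, chi2.sf(stat, df=1)

# Hypothetical example: two strata, exposure by outcome in each.
tables = [[[20, 10], [15, 25]],
          [[18, 12], [10, 30]]]
stat, p_value = cmh_statistic(tables)
print(round(stat, 3), round(p_value, 4))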

As the field developed, the need arose for a test statistic capable of handling correlated data and variables with more than two levels. Zhang and Boos (1995) introduced three test statistics, T_EL, T_P and T_U, as a solution to these problems. Among them, T_P and T_U are preferred to T_EL, since they use the individual subjects as the primary sampling units whereas T_EL uses the strata as the primary sampling units (De Silva and Sooriyarachchi, 2012).

Furthermore, a simulation study showed that T_P performs better than T_EL, maintaining its error rates even when the strata are small, and it uses pooled estimators for the variance. This provides a guideline for selecting T_P as the most suitable statistic for this study. De Silva and Sooriyarachchi (2012) developed an R program to carry out this test.

Missing Value Imputation in Multivariate Multilevel Framework

The problem of missing values often arises in real-world datasets, which typically contain little or no information about the missing data mechanism (MDM). Modeling incomplete data is therefore a difficult task and may produce biased results, so a proper mechanism is needed to characterize the missingness. Rubin (1976) presented three possible ways in which missingness can arise, classified as Missing At Random (MAR), Missing Completely At Random (MCAR) and Missing Not At Random (MNAR). According to Sterne et al. (2009), missing value imputation is appropriate under the assumption of missing at random, although it can also be carried out when data are missing completely at random. Nowadays most statistical packages offer some capability for investigating the type of missingness.
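As a small illustration of the distinction between MCAR and MAR, the following Python sketch, based entirely on simulated data, shows how an observed-data mean remains roughly unbiased under MCAR but becomes biased under MAR unless the variable driving the missingness is used in the analysis or imputation; the variables and missingness rates are hypothetical.

import numpy as np

rng = np.random.default_rng(42)
n = 1000
age = rng.normal(40, 10, size=n)                       # fully observed covariate
score = 50 + 0.5 * age + rng.normal(0, 5, size=n)      # outcome that will have missing values

# MCAR: each score is missing with the same probability, unrelated to any variable.
mcar_mask = rng.random(n) < 0.2
score_mcar = np.where(mcar_mask, np.nan, score)

# MAR: the probability of a missing score depends on the observed covariate (age),
# but not on the unobserved score itself.
p_miss = 1 / (1 + np.exp(-(age - 45) / 5))             # units with larger age are more often missing
mar_mask = rng.random(n) < p_miss
score_mar = np.where(mar_mask, np.nan, score)

# Under MCAR the observed mean is roughly unbiased; under MAR it is biased
# unless the mechanism (here, age) is modeled or used in the imputation.
print(round(score.mean(), 2), round(np.nanmean(score_mcar), 2), round(np.nanmean(score_mar), 2))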

Once the type of missingness has been identified, missing value imputation can be carried out, and this requires a suitable statistical package. Imputation within a hierarchical structure is more advanced and cannot easily be done using standard statistical packages such as SPSS, SAS or R. Carpenter et al. (2009) therefore developed the REALCOM software to perform this task. However, the earlier version of REALCOM did not deal with multilevel data in a multivariate context, so macros for this purpose were recently developed by the University of Bristol team.

Estimation Procedure

Estimation procedures for multilevel modeling date from the late 1980s. For parameter estimation by maximum likelihood, early researchers used an iterative procedure called the EM algorithm (Raudenbush, Rowan and Kang, 1991). Later, the program HLM was developed to implement this algorithm.

The most commonly used procedures for estimating multivariate multilevel models with Normal responses are Iterative Generalized Least Squares (IGLS), Reweighted IGLS (RIGLS) and Marginal Quasi-Likelihood (MQL), while for discrete responses MQL and Penalized Quasi-Likelihood (PQL) are used. According to Rasbash, Steele, Browne and Goldstein (2004), all of these methods are implemented in MLwiN, using first-order or second-order Taylor series expansions. However, since these are likelihood-based frequentist methods, they tend to overestimate precision.
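IGLS, RIGLS, MQL and PQL are specific to MLwiN and are not reproduced here. As a rough stand-in, the following Python sketch fits a likelihood-based random-intercept model for a single Normal response with statsmodels, which conveys the spirit of likelihood-based multilevel estimation; the data and variable names are simulated and hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate a simple two-level structure: students (level 1) nested in schools (level 2).
rng = np.random.default_rng(7)
n_schools, n_per_school = 30, 25
school = np.repeat(np.arange(n_schools), n_per_school)
u = rng.normal(0, 2, size=n_schools)                   # school-level random intercepts
x = rng.normal(0, 1, size=n_schools * n_per_school)    # student-level covariate
y = 10 + 1.5 * x + u[school] + rng.normal(0, 3, size=n_schools * n_per_school)
df = pd.DataFrame({"y": y, "x": x, "school": school})

# Random-intercept model fitted by maximum likelihood (reml=False), analogous in spirit
# to likelihood-based estimation of a single Normal response in MLwiN.
model = smf.mixedlm("y ~ x", df, groups=df["school"])
result = model.fit(reml=False)
print(result.summary())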

More recently, methods implemented in a Bayesian framework using Markov Chain Monte Carlo (MCMC) methods (Brooks, 1998) have also been used for parameter estimation, allowing the use of informative prior distributions. The MCMC procedures implemented in MLwiN provide consistent estimates, although they require a large number of simulations to control for highly correlated chains.

Previous Research Conducted Using Univariate and Multivariate Multilevel Models

Univariate multilevel logit models

Before looking at the literature on multivariate multilevel analysis, it is also necessary to consider the literature on univariate multilevel analysis, since this thesis fits some univariate multilevel models prior to fitting multivariate multilevel models.

In past decades, many social scientists applied multilevel models to binary data, so it is important to review how they implemented their work with the technology then available. To that end, Guo and Zhao (2000) reviewed the methodologies, hypothesis testing and hierarchical data structures involved in the past literature, and presented two examples to support their conclusions. First, they compared estimates obtained from the MQL and PQL methods implemented in MLn with the GLIMMIX method implemented in SAS. They showed that the differences between PQL-1 and PQL-2 are small when fitting binary logistic models, and that PQL-1, PQL-2 and GLIMMIX are likely to be satisfactory for most of the past studies undertaken in the social sciences.

Noortgate, Boeck and Meulders (2003) used multilevel binary logit models to analyze Item Response Theory (IRT) models. They assessed nine achievement targets for reading comprehension among primary school students in Belgium, performing a multilevel analysis using cross-classified logistic multilevel models with the GLIMMIX macro in SAS as well as the MLwiN software. However, they encountered convergence problems when using the PQL methods in MLwiN and therefore used SAS to carry out the analysis. They also showed that the cross-classified multilevel logistic model is very flexible for handling IRT data and that the parameters can still be estimated even in the presence of unbalanced data.

Multivariate Multilevel Models

In the past two decades only a few studies have sought to fit multivariate multilevel models to real-world scenarios. Almost all of them focus on the educational and socio-economic sectors; none addresses medical scenarios. Given this lack of multivariate multilevel analysis in the field of health and medical sciences, this chapter reviews the literature on multivariate multilevel models in other fields.

Among previous studies in education, Xin Ma (2001) examined the association between academic achievement and student background in Canada by considering three levels of interest. A three-level Hierarchical Linear Model (HLM) was developed for this purpose. This work allowed him to conclude that both students and schools were differentially successful in different subject areas, and that this was more obvious among students than among schools. However, the success of this study rests on some strong assumptions about the priors for students' cognitive skills.

Outside the field of education, Raudenbush, Johnson and Sampson (2003) carried out a study in Chicago to examine criminal behavior at the person level as well as at the neighborhood level with respect to some personal characteristics. For this purpose they used a Rasch model with random effects, assuming conditional independence and additivity.

Moreover, Yang, Goldstein, Browne and Woodhouse (2002) developed a multivariate multilevel analysis of examination results via a series of models of increasing complexity. They used results from two mathematics examinations in England in 1997 and analyzed them at the individual and institutional levels with respect to some student features. Starting from a simpler multivariate normal model without institutional random effects, they gradually increased the complexity of the model by adding institutional levels together with the multivariate responses. On close inspection, their work shows that the choice of subject is strongly associated with performance.

Along with this growth in applications of multivariate multilevel models, researchers have begun to apply them in other fields such as forestry. Hall and Clutter (2004) presented a study on modeling growth and yield in forestry based on slash pine in the U.S.A. In their work, they developed a methodology for fitting nonlinear mixed effects models in a multivariate multilevel framework in order to identify the effects of several plot-level timber quantity characteristics on timber volume yield.

In addition, they developed a methodology to produce predictions and prediction intervals from those models, and used it to predict timber growth and yield at the plot, individual and population levels.

Grilli and Rampichini (2003) carried out a study modeling ordinal response variables using student rating data obtained from a survey of course quality conducted by the University of Florence in the 2000-2001 academic year. They developed an alternative specification of the multivariate multilevel probit ordinal response model, relying on the fact that the responses may be viewed as an additional dummy bottom level. However, they had not yet assessed the efficiency of that method, since it had not been implemented in standard software.

Among recent applications of these models, Goldstein and Kounali (2009) conducted a study of childhood growth with respect to a collection of growth measurements and adult characteristics. They extended the latent normal model for multilevel data with mixed response types to ordinal categorical responses with multiple categories. Since the data consist of counts, they initially developed a model assuming a Poisson distribution; however, because the data do not exactly follow a Poisson distribution, they treated the counts as ordered categories to overcome that problem.

Frank, Cerda and Rendon (2007) studied whether residential location has an impact on the health risk behaviors of Latino immigrants, whose numbers are increasing substantially every year. They used a multivariate multilevel Rasch model for data obtained from the Los Angeles Family and Neighborhood Survey, based on two indices of health risk behaviors covering drug use and participation in risky activities. They began by modeling the behavior of adolescents as a function of both individual and neighborhood characteristics. They found that increased health risk behaviors are associated with above-average concentrations of Latinos and poverty, particularly for those born in the U.S.A.

Another application of multivariate multilevel models was carried out by Subramanian, Kim and Kawachi (2005) in the U.S.A. Their main aim was to identify individual- and community-level factors affecting the health and happiness of individuals. They performed a multivariate multilevel regression analysis on data obtained from a survey conducted in 2000. Their findings indicate that poor health and unhappiness are strongly related to the individual-level covariates.

Looking at the available literature, it can be seen that a number of studies have been conducted in education and the social sciences in other countries, but none concerning health and medical sciences. It is therefore essential to perform a study analyzing the mortality rates of some major fatal diseases that are spread worldwide, in order to understand the risk factors and patterns associated with these diseases and to provide better insights to the public as well as to the responsible policy makers.